History log of /netbsd-current/sys/kern/init_main.c
Revision Date Author Comments
# 1.549 05-Mar-2024 thorpej

Revert previous until I can diagnose a failure reported by gson.


# 1.548 05-Mar-2024 thorpej

Early in main(), assert that curcpu() evaluates as the primary CPU and
stash away a pointer to it as the boot CPU for quick reference later.


# 1.547 17-Jan-2024 hannken

Protect kernel hooks exechook, exithook and forkhook with rwlock.
Lock as writer on establish/disestablish and as reader on list traverse.

For exechook ride "exec_lock" as it is already taken as reader when
traversing the list. Add local locks for exithook and forkhook.

Move exec_init before signal_init as signal_init calls exechook_establish()
that needs "exec_lock".

PR kern/39913 "exec, fork, exit hooks need locking"


Revision tags: thorpej-ifq-base thorpej-altq-separation-base
# 1.546 23-Sep-2023 ad

Reapply this change with a couple of bugs fixed:

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.545 12-Sep-2023 ad

Back out recent change to replace pool_cache with the general allocator.
Will return to this when I have time again.


# 1.544 10-Sep-2023 ad

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.
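
The progress check described above can be sketched as a counter that the monitored CPU bumps and a checker that runs once per second. This is a simplified model, not the heartbeat(9) implementation: sk_heartbeat, sk_heartbeat_beat and sk_heartbeat_stuck are hypothetical names, and the hand-off of the panic to the stuck CPU is left out.

```c
#include <assert.h>
#include <stdbool.h>

struct sk_heartbeat {
	unsigned progress;	/* bumped by the monitored CPU */
	unsigned last_seen;	/* value the checker saw last time */
	unsigned stale_secs;	/* consecutive checks with no progress */
};

/* Called by the monitored CPU whenever it makes progress. */
static void
sk_heartbeat_beat(struct sk_heartbeat *hb)
{
	hb->progress++;
}

/*
 * Called once per second by the checker.  Returns true once no
 * progress has been observed for max_period consecutive checks.
 */
static bool
sk_heartbeat_stuck(struct sk_heartbeat *hb, unsigned max_period)
{
	assert(max_period > 0);
	if (hb->progress != hb->last_seen) {
		hb->last_seen = hb->progress;
		hb->stale_secs = 0;
		return false;
	}
	return ++hb->stale_secs >= max_period;
}
```

With max_period set to 15 (the default quoted for kern.heartbeat.max_period), the checker in this model reports a stall on the fifteenth consecutive check that sees no new beat.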

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


Revision tags: netbsd-10-0-RC5 netbsd-10-0-RC4 netbsd-10-0-RC3 netbsd-10-0-RC2 netbsd-10-0-RC1 netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
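
The thresholds above amount to a small step function of physical memory. A minimal sketch, assuming memory is given in bytes; default_maxfiles is a hypothetical name, and the fallback for machines under 1GB is a parameter because the log does not state it.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/*
 * Pick a default file-table limit from the amount of physical memory.
 * The 20,000 and 10,000 values come from the commit message; the
 * fallback for smaller machines is supplied by the caller.
 */
static int
default_maxfiles(uint64_t physmem_bytes, int fallback)
{
	assert(fallback > 0);
	if (physmem_bytes >= 16 * GiB)
		return 20000;
	if (physmem_bytes >= 1 * GiB)
		return 10000;
	return fallback;
}
```

An explicit MAXFILES in the kernel config would bypass this choice entirely, as the commit message says.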


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
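
The trade-off behind the boolean argument can be sketched as follows. This is a mock under stated assumptions: the per-CPU counter array and the names sk_percpu_free, sk_cached_free, sk_full_sums and sk_availmem are invented for illustration and do not reflect how UVM actually keeps its totals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SK_NCPU 4

static long sk_percpu_free[SK_NCPU];	/* mock per-CPU free page counts */
static long sk_cached_free;		/* last total that was summed */
static unsigned sk_full_sums;		/* how many expensive sums ran */

/*
 * cached == true:  return the most recently computed total (cheap).
 * cached == false: walk every CPU's counter and refresh the total.
 */
static long
sk_availmem(bool cached)
{
	if (!cached) {
		long total = 0;

		for (size_t i = 0; i < SK_NCPU; i++)
			total += sk_percpu_free[i];
		sk_cached_free = total;
		sk_full_sums++;
	}
	assert(sk_cached_free >= 0);
	return sk_cached_free;
}
```

Callers that only need a rough figure pass true and avoid touching other CPUs' data; callers that must have the latest total pass false and pay for the full sum.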


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
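
The usage rule for the epoch counter (cache it when you seed, reseed when it has changed) can be sketched like this. The epoch source here is a mock: mock_epoch and mock_entropy_epoch stand in for the kernel's entropy_epoch(), and struct seeded, seeded_init and seeded_check are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Mock of the kernel's epoch counter. */
static unsigned mock_epoch = (unsigned)-1;	/* -1: never had full entropy */

static unsigned
mock_entropy_epoch(void)
{
	return mock_epoch;
}

/* Hypothetical consumer that keeps some seeded state. */
struct seeded {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned reseeds;	/* how many times we reseeded */
};

static void
seeded_init(struct seeded *s)
{
	s->cached_epoch = 0;	/* the epoch itself is never zero */
	s->reseeds = 0;
}

/* Call before using the seeded state; returns true if it reseeded. */
static bool
seeded_check(struct seeded *s)
{
	unsigned epoch = mock_entropy_epoch();

	assert(epoch != 0);
	if (epoch == s->cached_epoch)
		return false;
	/* ...fresh key material would be drawn here... */
	s->cached_epoch = epoch;
	s->reseeds++;
	return true;
}
```

The sketch compares for any change rather than testing for -1, which is the approach the commit message recommends.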

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.
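
The split between lock-held functions and historically named wrappers can be sketched with a single-threaded mock. Everything prefixed sk_ is a hypothetical stand-in: a boolean replaces the real mutex, two longs replace the hardware clock and the system time, and the argument that todr_set_systime() takes in the pattern above is omitted.

```c
#include <assert.h>
#include <stdbool.h>

static bool sk_todr_held;	/* stand-in for the TODR mutex */
static long sk_rtc;		/* pretend hardware clock, in seconds */
static long sk_systime;		/* pretend system time, in seconds */

static void
sk_todr_lock(void)
{
	assert(!sk_todr_held);
	sk_todr_held = true;
}

static void
sk_todr_unlock(void)
{
	assert(sk_todr_held);
	sk_todr_held = false;
}

static bool
sk_todr_lock_owned(void)
{
	return sk_todr_held;
}

/* These two expect the caller to hold the lock. */
static void
sk_todr_save_systime(void)
{
	assert(sk_todr_lock_owned());
	sk_rtc = sk_systime;
}

static void
sk_todr_set_systime(void)
{
	assert(sk_todr_lock_owned());
	sk_systime = sk_rtc;
}

/* Historically named wrappers: take the lock around the real work. */
static void
sk_resettodr(void)
{
	sk_todr_lock();
	sk_todr_save_systime();
	sk_todr_unlock();
}

static void
sk_inittodr(void)
{
	sk_todr_lock();
	sk_todr_set_systime();
	sk_todr_unlock();
}

/* The suspend/resume pattern, with the lock held across the sleep. */
static void
sk_suspend_resume(long slept_secs)
{
	sk_todr_lock();
	sk_todr_save_systime();
	sk_rtc += slept_secs;	/* the hardware clock keeps running */
	sk_todr_set_systime();
	sk_todr_unlock();
}
```

Holding the lock across the whole sequence keeps any other saver or setter from running between the save and the restore.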

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).
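
How a lost cast produces this kind of overflow can be sketched as follows. The sketch assumes 4 KiB pages and a 32-bit page count; the function names are hypothetical and the actual expression in init_main.c is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

#define SK_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Without the cast the shift is done in 32 bits and wraps at 4 GiB. */
static uint64_t
avail_bytes_lost_cast(uint32_t npages)
{
	return npages << SK_PAGE_SHIFT;
}

/* With the cast the shift is done in 64 bits. */
static uint64_t
avail_bytes_with_cast(uint32_t npages)
{
	return (uint64_t)npages << SK_PAGE_SHIFT;
}

/* Self-check: 2^20 pages of 4 KiB is exactly 4 GiB. */
static void
avail_bytes_selfcheck(void)
{
	assert(avail_bytes_lost_cast(UINT32_C(1) << 20) == 0);
	assert(avail_bytes_with_cast(UINT32_C(1) << 20) == UINT64_C(1) << 32);
}
```

Widening the return type alone does not help, because the wrap happens in the 32-bit shift before the result is converted.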

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
while the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
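
The arithmetic described above (boottime is the time of day minus the uptime at the moment the clock was set, and a message's wall-clock time is boottime plus its logged uptime, truncated to seconds) can be sketched with struct timespec. The helper names are hypothetical and this is not the dmesg or kernel code.

```c
#include <assert.h>
#include <time.h>

#define SK_NSEC_PER_SEC 1000000000L

static struct timespec
sk_timespec_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < SK_NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < SK_NSEC_PER_SEC);
	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= SK_NSEC_PER_SEC) {	/* carry into seconds */
		r.tv_sec++;
		r.tv_nsec -= SK_NSEC_PER_SEC;
	}
	return r;
}

static struct timespec
sk_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < SK_NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < SK_NSEC_PER_SEC);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {			/* borrow from seconds */
		r.tv_sec--;
		r.tv_nsec += SK_NSEC_PER_SEC;
	}
	return r;
}

/* boottime = time of day when it was set - uptime at that moment */
static struct timespec
sk_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return sk_timespec_sub(tod_at_init, uptime_at_init);
}

/* Wall-clock second of a message: boottime + logged uptime, truncated. */
static time_t
sk_message_time(struct timespec boottime, struct timespec logged_uptime)
{
	return sk_timespec_add(boottime, logged_uptime).tv_sec;
}
```

Because the result is truncated rather than rounded, a message logged just before a second boundary keeps the earlier second, which matches the one-second disparity noted above.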


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)
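
The run-once pattern at syscall entry can be sketched in userspace with pthread_once, which plays a role similar to the kernel's RUN_ONCE here. The names exithook_once, exithook_established, establish_exithook and sketch_sys_semop are hypothetical, and the body of the syscall is elided.

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t exithook_once = PTHREAD_ONCE_INIT;
static int exithook_established;	/* counts how often setup ran */

/* One-time setup; in the kernel this would establish the exithook. */
static void
establish_exithook(void)
{
	exithook_established++;
}

/* Hypothetical syscall entry point: establish the hook lazily. */
static int
sketch_sys_semop(void)
{
	int error = pthread_once(&exithook_once, establish_exithook);

	assert(error == 0);
	(void)error;
	/* ...the actual semaphore operation would go here... */
	return 0;
}
```

However many times the entry point is called, the setup routine runs once. This is also why the hook may never have been set if the module is unloaded before any syscall is made.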

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
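
A minimal sketch of how a NULL-terminated fallback list like the one above can be walked, assuming each entry is tried in order until one succeeds. The names initpaths_sketch, pick_init_sketch and the two predicates are hypothetical and only illustrate the list shape quoted in this message; they are not the kernel's start_init() code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same order as the list quoted in the commit message. */
static const char *const initpaths_sketch[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path for which try_exec() reports success (0),
 * or NULL if every candidate fails. try_exec is a caller-supplied
 * stand-in for "attempt to exec this path".
 */
static const char *
pick_init_sketch(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);
	for (; *paths != NULL; paths++) {
		if (try_exec(*paths) == 0)
			return *paths;
	}
	return NULL;
}

/* Example predicate: pretend only the rescue copy is usable. */
static int
only_rescue_exists(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Example predicate: pretend no init program is usable. */
static int
nothing_exists(const char *path)
{
	(void)path;
	return -1;
}
```

With only_rescue_exists the walk falls through the first three entries and settles on "/rescue/init"; with nothing_exists it returns NULL, which is the point where a caller would have to panic or prompt.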


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from both MD code and MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early in boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed, with no objection, on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
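
The behaviour described above (lockless fast path, one global lock for
first-time callers, no retry after an error) can be sketched in user space
roughly as follows. This is not the kernel implementation: the names
once_sketch, run_once_sketch() and demo_init() are hypothetical, and a C11
atomic_flag spin lock stands in for whatever global lock the kernel uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once-control: records completion and the stored result. */
struct once_sketch {
	atomic_int o_done;	/* nonzero once the init routine has run */
	int o_error;		/* result of the single init call */
};

/* One global lock shared by all once-controls (spin lock for brevity). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Lockless fast path: init already ran, report its stored result. */
	if (atomic_load_explicit(&o->o_done, memory_order_acquire))
		return o->o_error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;
	if (!atomic_load_explicit(&o->o_done, memory_order_relaxed)) {
		/* Run exactly once; an error is remembered, never retried. */
		o->o_error = fn();
		atomic_store_explicit(&o->o_done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->o_error;
}

/* Example init routine that always fails with a nonzero error. */
static int demo_calls;
static struct once_sketch demo_once;

static int
demo_init(void)
{
	demo_calls++;
	return 5;
}
```

Calling run_once_sketch(&demo_once, demo_init) twice runs demo_init() only
once; the second call returns the remembered error without retrying.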


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
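
As a worked example of the arithmetic, here is a sketch assuming the limit is
computed from the smaller of the two 75% budgets. The 2kB cost and the 75%
figure come from the message above; the function names and the exact rounding
are illustrative assumptions, not the kernel's code.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode in bytes, per the commit message. */
#define VNODE_COST_SKETCH	2048u

/*
 * Largest vnode count whose worst-case footprint stays within 75% of
 * both the kernel virtual address space and physical memory.
 */
static uint64_t
vnode_limit_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;
	uint64_t limit = smaller / 4 * 3 / VNODE_COST_SKETCH;

	/* Sanity check: the bound never exceeds the smaller resource. */
	assert(limit * VNODE_COST_SKETCH <= smaller);
	return limit;
}

/* Would a requested desiredvnodes value be accepted under this bound? */
static int
vnode_request_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= vnode_limit_sketch(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA is the tighter
constraint: 0.75 * 2^30 / 2048 = 393216 vnodes.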


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer the callout_startup2() call (which calls softintr_establish(9))
until after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
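
The refcounting rule in the list above (a lookup takes a reference, release
drops it, detaching an in-use discipline fails with EBUSY) can be sketched as
below. This is a simplified illustration without locking; the structure and
the *_sketch function names are hypothetical and do not reproduce the real
ttyldisc_*() signatures.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line discipline entry kept on a linked list. */
struct ldisc_sketch {
	const char *l_name;
	unsigned int l_refcnt;		/* references held by lookups */
	struct ldisc_sketch *l_next;
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline by linking it onto the list. */
static void
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	d->l_refcnt = 0;
	d->l_next = ldisc_list;
	ldisc_list = d;
}

/* Look up by name; a successful lookup holds a reference. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	struct ldisc_sketch *d;

	for (d = ldisc_list; d != NULL; d = d->l_next) {
		if (strcmp(d->l_name, name) == 0) {
			d->l_refcnt++;
			return d;
		}
	}
	return NULL;
}

/* Drop a reference taken by a lookup. */
static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	assert(d->l_refcnt > 0);
	d->l_refcnt--;
}

/* Unlink a discipline; refuse with EBUSY while references remain. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch **pp;

	if (d->l_refcnt != 0)
		return EBUSY;
	for (pp = &ldisc_list; *pp != NULL; pp = &(*pp)->l_next) {
		if (*pp == d) {
			*pp = d->l_next;
			return 0;
		}
	}
	return ENOENT;
}

/* Example discipline used to exercise the sketch. */
static struct ldisc_sketch demo_disc = { .l_name = "demo" };
```

In this sketch, detaching demo_disc while a lookup still holds it returns
EBUSY; after the matching release, the detach succeeds and later lookups
find nothing.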


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard
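
The "drop rather than clamp" choice for received beacon intervals can be sketched as a simple range check. This is an illustration only: the function name and the idea of passing the limits as parameters are assumptions, and no particular limit values are implied by the log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return true if a received beacon interval lies
 * within [lo, hi]. A caller following the policy above would drop the
 * frame on false instead of clamping the value.
 */
static bool
bintval_ok(uint16_t bintval, uint16_t lo, uint16_t hi)
{
	assert(lo <= hi);
	return bintval >= lo && bintval <= hi;
}
```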

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)
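
Why signed space accounting matters for a sanity check can be shown with a generic example. The log does not say what the actual check looked like, so the function names and the pktlen/hdrlen parameters below are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define BLOCK_LEN 16	/* bytes in one AES block */

/* Unsigned accounting: hdrlen > pktlen wraps to a huge value and passes. */
static bool
space_ok_unsigned(unsigned int pktlen, unsigned int hdrlen)
{
	unsigned int space = pktlen - hdrlen;

	return space >= BLOCK_LEN;
}

/* Signed accounting: the same malformed input goes negative and is rejected. */
static bool
space_ok_signed(int pktlen, int hdrlen)
{
	int space;

	assert(pktlen >= 0 && hdrlen >= 0);
	space = pktlen - hdrlen;
	return space >= BLOCK_LEN;
}
```

With a malformed chain where the header length exceeds the packet length, only the signed variant refuses to proceed.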

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
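
The expansion in the example above can be sketched in userland C. This is not the kernel implementation: magic_tab, magic_expand and the sample values are hypothetical, and the rule that a magic string must end at '/' or at the end of the path is an assumption chosen so that @machine does not match inside @machine_arch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct magic { const char *name; const char *value; };

/* Sample values only; a real kernel would take these from system state. */
static const struct magic magic_tab[] = {
	{ "@machine_arch", "i386" },
	{ "@machine",      "i386" },
	{ "@ostype",       "NetBSD" },
};

/* Expand magic strings from `in` into `out`; -1 if `out` is too small. */
static int
magic_expand(const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	assert(outlen > 0);
	while (*in != '\0') {
		const char *src = NULL;
		size_t srclen = 1, skip = 1;

		if (*in == '@') {
			for (size_t i = 0; i < sizeof(magic_tab) / sizeof(magic_tab[0]); i++) {
				size_t n = strlen(magic_tab[i].name);

				/* Match whole names only: next char is '/' or NUL. */
				if (strncmp(in, magic_tab[i].name, n) == 0 &&
				    (in[n] == '/' || in[n] == '\0')) {
					src = magic_tab[i].value;
					srclen = strlen(src);
					skip = n;
					break;
				}
			}
		}
		if (src == NULL)
			src = in;	/* copy one ordinary character */
		if (o + srclen >= outlen)
			return -1;	/* no room for the data plus the NUL */
		memcpy(out + o, src, srclen);
		o += srclen;
		in += skip;
	}
	out[o] = '\0';
	return 0;
}
```

Unknown names such as @unknown are copied through unchanged, so ordinary symlink targets containing '@' keep working in this sketch.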


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
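
A marker-skipping iteration macro of the kind described can be sketched as follows. The struct and macro here are simplified stand-ins (PROC_FOREACH and is_marker are hypothetical names), not the kernel's PROCLIST_FOREACH definition.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for struct proc; field names are hypothetical. */
struct proc {
	struct proc *next;
	int pid;
	int is_marker;	/* nonzero for a placeholder left by an iterator */
};

/* Visit every real entry on the list, skipping marker entries. */
#define PROC_FOREACH(var, head)						\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (!(var)->is_marker)

static int
count_procs(struct proc *head)
{
	struct proc *p;
	int n = 0;

	PROC_FOREACH(p, head) {
		assert(p->pid >= 0);
		n++;
	}
	return n;
}
```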


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
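
The sizing rule can be worked through with a little arithmetic. The helper below is a hedged sketch, not the init_main.c code: default_vnodes and its parameters are hypothetical, and the lower bound is an assumption carried over from the minimum that the original 0.5% change describes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size the vnode cache at 1% of physical memory,
 * never going below a configured minimum.
 */
static uint64_t
default_vnodes(uint64_t physmem_bytes, uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t n;

	assert(vnode_size > 0);
	n = (physmem_bytes / 100) / vnode_size;
	return n > min_vnodes ? n : min_vnodes;
}
```

Because the result is divided by the vnode size, a port with a larger struct vnode gets fewer vnodes from the same 1%, which is the concern raised for 64-bit ports.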


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
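
The "run until nobody does any work" behaviour of the finalization hook can be sketched like this. The types and names (finalizer_fn, struct finalizer, run_finalizers, configure_one) are hypothetical and do not reflect the autoconfiguration API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; a finalizer returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/* Call every finalizer repeatedly until one full pass does no work. */
static int
run_finalizers(const struct finalizer *tab, size_t n)
{
	int passes = 0, did_work;

	assert(tab != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			did_work |= (tab[i].fn(tab[i].arg) != 0);
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: configures one pending unit per call. */
static int
configure_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Iterating to a fixed point lets one finalizer react to devices that another finalizer created in the previous pass.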


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
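
The mapping above can be written as a small translation function. This is a sketch for illustration: the OLD_KERN_* enum and kern_to_errno are hypothetical names, the enum's numeric values are not the historical ones, and the handling of the failure case (assert, then -1) is an assumption modelled on the "mostly turn into KASSERTs" note.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative enum; the numeric values are not the historical ones. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
static int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	case OLD_KERN_FAILURE:
	default:
		/* No single errno equivalent; treated as a bug in this sketch. */
		assert(!"no direct errno equivalent");
		return -1;
	}
}
```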


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
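
The overflow being avoided can be illustrated with a page-count calculation. The log does not show the affected expression, so the functions below are hypothetical, and using uint64_t as a stand-in assumes rlim_t is a 64-bit type.

```c
#include <assert.h>
#include <stdint.h>

/* Result kept in 32 bits: truncates once the product reaches 4 GiB. */
static uint32_t
avail_bytes32(uint32_t pages, uint32_t page_size)
{
	return (uint32_t)((uint64_t)pages * page_size);
}

/* Result kept in 64 bits, standing in for an rlim_t-sized value. */
static uint64_t
avail_bytes64(uint32_t pages, uint32_t page_size)
{
	assert(page_size != 0);
	return (uint64_t)pages * page_size;
}
```

For example, 1048576 pages of 4096 bytes is exactly 4 GiB, which truncates to zero in the 32-bit version.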


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
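
The two properties claimed for the new mechanism, caller-supplied storage and constant-time insertion and removal, can be sketched with a small bucketed wheel. This is an assumption-laden illustration and not the kernel's callout code: the struct layout, WHEEL_SIZE and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified type; not the kernel's struct callout. */
struct callout {
	struct callout *prev, *next;	/* links live in caller-owned storage */
	void (*fn)(void *);
	void *arg;
	int pending;
};

#define WHEEL_SIZE 8	/* hypothetical; must be a power of two */
static struct callout wheel[WHEEL_SIZE];	/* one circular list head per slot */

static void
wheel_init(void)
{
	for (size_t i = 0; i < WHEEL_SIZE; i++)
		wheel[i].prev = wheel[i].next = &wheel[i];
}

/* O(1): append to the bucket for the expiry tick; nothing is allocated. */
static void
callout_schedule(struct callout *c, unsigned int tick,
    void (*fn)(void *), void *arg)
{
	struct callout *head = &wheel[tick & (WHEEL_SIZE - 1)];

	assert(!c->pending);
	c->fn = fn;
	c->arg = arg;
	c->next = head;
	c->prev = head->prev;
	head->prev->next = c;
	head->prev = c;
	c->pending = 1;
}

/* O(1): unlink without searching; harmless if the callout is not pending. */
static void
callout_cancel(struct callout *c)
{
	if (!c->pending)
		return;
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->pending = 0;
}

static int
wheel_slot_empty(unsigned int slot)
{
	const struct callout *head = &wheel[slot & (WHEEL_SIZE - 1)];

	return head->next == head;
}
```

Because the links are embedded in the client's own struct, scheduling can never fail for lack of memory, which is the resource allocation problem the old handle-returning API had.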


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
fomat is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
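
As a rough illustration of why a (low address, size) pair is enough, the sketch below derives an initial stack pointer for either growth direction. The enum and function names are hypothetical, and real machine-dependent code would also deal with alignment and frame setup:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum stack_dir {
	STACK_GROWS_DOWN,	/* most architectures */
	STACK_GROWS_UP
};

/*
 * Given the low address and size of a stack area, pick the address
 * the stack pointer should start from.
 */
static inline uintptr_t initial_sp(uintptr_t low_addr, size_t size,
    enum stack_dir dir)
{
	if (dir == STACK_GROWS_DOWN)
		return low_addr + size;	/* start at the top, grow toward low_addr */
	return low_addr;		/* start at the bottom, grow upward */
}
```

The caller never has to know the direction; it only describes the region.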


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
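
The rule described above can be sketched as follows, assuming a simple pair of unsigned values instead of the kernel's real rlimit structure. `limit_pair`, `min_u64`, and `initial_limit` are hypothetical names:

```c
#include <stdint.h>

/* Hypothetical stand-in for a soft/hard resource limit pair. */
struct limit_pair {
	uint64_t cur;	/* soft limit (rlim_cur) */
	uint64_t max;	/* hard limit (rlim_max) */
};

static inline uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * compile_default plays the role of NOFILE or MAXUPRC;
 * system_max plays the role of maxfiles or maxproc.
 */
static inline struct limit_pair initial_limit(uint64_t compile_default,
    uint64_t system_max)
{
	struct limit_pair lim;

	lim.cur = min_u64(compile_default, system_max);	/* never above the system cap */
	lim.max = system_max;				/* finite, not RLIM_INFINITY */
	return lim;
}
```

With this shape, a program that sizes a table from the hard limit gets a bounded number rather than an "infinite" one.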


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.
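
A minimal sketch of a deferred-callback queue in the spirit of this change, assuming a fixed-size array and no locking. All names here (`deferred_queue`, `defer_call`, `run_deferred`, `bump_counter`) are hypothetical and do not reflect the kernel's actual interface:

```c
#include <stddef.h>

#define MAX_DEFERRED 8	/* arbitrary capacity for the sketch */

typedef void (*deferred_fn)(void *);

struct deferred_entry {
	deferred_fn fn;	/* callback that would create the thread */
	void *arg;	/* argument handed back to the callback */
};

struct deferred_queue {
	struct deferred_entry entries[MAX_DEFERRED];
	size_t count;
};

/* Queue a callback for later; returns 0 on success, -1 if full. */
static inline int defer_call(struct deferred_queue *q, deferred_fn fn,
    void *arg)
{
	if (q->count == MAX_DEFERRED)
		return -1;
	q->entries[q->count].fn = fn;
	q->entries[q->count].arg = arg;
	q->count++;
	return 0;
}

/* Run every queued callback in the order it was queued, then reset. */
static inline void run_deferred(struct deferred_queue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->entries[i].fn(q->entries[i].arg);
	q->count = 0;
}

/* Example callback: counts how many times it was invoked. */
static inline void bump_counter(void *arg)
{
	(*(int *)arg)++;
}
```

Nothing runs at `defer_call()` time; the callbacks fire only when `run_deferred()` is called, which in this analogy would happen once the essential early processes exist.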


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
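
The point that is easy to miss in this mapping is the argument order: the b* routines take the source first, while the mem* routines take the destination first. A small sketch with thin shims makes the swap explicit. The `compat_*` names are hypothetical and only mirror the table above:

```c
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len): source comes first; regions must not overlap. */
static inline void compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): like bcopy, but overlap is allowed. */
static inline void compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): zero means equal; argument order is unchanged. */
static inline int compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len): memset gains an explicit fill value of 0. */
static inline void compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

A mechanical search-and-replace that kept the old argument order for the copy routines would silently copy in the wrong direction, which is why the table spells out each rewrite.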


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.549 05-Mar-2024 thorpej

Revert previous until I can diagnose a failure reported by gson.


# 1.548 05-Mar-2024 thorpej

Early in main(), assert that curcpu() evaluates as the primary CPU and
stash away a pointer to it as the boot CPU for quick reference later.


# 1.547 17-Jan-2024 hannken

Protect kernel hooks exechook, exithook and forkhook with rwlock.
Lock as writer on establish/disestablish and as reader on list traverse.

For exechook ride "exec_lock" as it is already take as reader when
traversing the list. Add local locks for exithook and forkhook.

Move exec_init before signal_init as signal_init calls exechook_establish()
that needs "exec_lock".

PR kern/39913 "exec, fork, exit hooks need locking"


Revision tags: thorpej-ifq-base thorpej-altq-separation-base
# 1.546 23-Sep-2023 ad

Repply this change with a couple of bugs fixed:

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.545 12-Sep-2023 ad

Back out recent change to replace pool_cache with then general allocator.
Will return to this when I have time again.


# 1.544 10-Sep-2023 ad

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


Revision tags: netbsd-10-0-RC5 netbsd-10-0-RC4 netbsd-10-0-RC3 netbsd-10-0-RC2 netbsd-10-0-RC1 netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is update to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including it's processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some fuctions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified the kernel config file. Arbitary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will be generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
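
The epoch-caching usage described above (cache the epoch when seeding,
reseed when it has changed, start the cache at 0 to force a reseed on
first use) can be sketched as below. This is a userland model under
stated assumptions: fake_entropy_epoch(), struct consumer and
consumer_use() are hypothetical stand-ins, and only the rules they
follow come from this commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's entropy_epoch(). */
static unsigned fake_epoch = (unsigned)-1;	/* -1: never had full entropy */

static unsigned
fake_entropy_epoch(void)
{
	return fake_epoch;
}

/* Hypothetical consumer: some seeded state plus the epoch it last saw. */
struct consumer {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned reseeds;	/* how many times we reseeded */
};

/* Reseed if the epoch changed since the last seed; report whether we did. */
static bool
consumer_use(struct consumer *c)
{
	unsigned now = fake_entropy_epoch();

	assert(now != 0);		/* the epoch is never zero */
	if (c->cached_epoch == now)
		return false;		/* seed is still current */
	c->cached_epoch = now;		/* cache the epoch at seeding time */
	c->reseeds++;
	return true;
}
```

Note that the sketch compares for change rather than testing for -1,
which matches the recommendation above: a later consolidation still
triggers a reseed even if the system once overestimated its entropy.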

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
They return the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
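
The lock/save/sleep/set/unlock pattern above can be modelled in
userland C as follows. Everything here is a hypothetical stand-in with
a sketch_ prefix: the "lock" is a plain flag, the clocks are plain
integers, and the real functions take arguments and do real hardware
work. Unlike the current kernel code described in the NOTE, the sketch
does assert that the lock is held in the save path.

```c
#include <assert.h>
#include <stdbool.h>

static bool todr_locked;		/* stand-in for the TODR mutex */
static long rtc_seconds;		/* stand-in for the hardware TOD clock */
static long system_seconds = 1000;	/* stand-in for the system time */

static void
sketch_todr_lock(void)
{
	assert(!todr_locked);	/* not recursive in this sketch */
	todr_locked = true;
}

static void
sketch_todr_unlock(void)
{
	assert(todr_locked);
	todr_locked = false;
}

/* Save system time to the clock; the caller must hold the lock. */
static void
sketch_todr_save_systime(void)
{
	assert(todr_locked);
	rtc_seconds = system_seconds;
}

/* Load system time from the clock; the caller must hold the lock. */
static void
sketch_todr_set_systime(void)
{
	assert(todr_locked);
	system_seconds = rtc_seconds;
}

/* Shape of a historically-named wrapper: lock around the real work. */
static void
sketch_resettodr(void)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	sketch_todr_unlock();
}

/* The suspend/resume pattern, performed under a single lock hold. */
static void
sketch_suspend_resume(long seconds_asleep)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	rtc_seconds += seconds_asleep;	/* the clock keeps ticking while asleep */
	sketch_todr_set_systime();
	sketch_todr_unlock();
}
```

Holding the lock across the whole sequence is the point of the split:
nothing else can touch the clock between the save and the later set.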


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
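
The commit message does not show the affected expression, so the
following is only a generic illustration of how a lost cast can
overflow when a page count is converted to bytes. The page size and
both function names are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for the example */

/* With the cast: the multiplication is carried out in 64 bits. */
static uint64_t
avail_bytes_widened(int npages)
{
	return (uint64_t)npages * SKETCH_PAGE_SIZE;
}

/*
 * Without a widening cast the product is computed in 32 bits.
 * Unsigned arithmetic is used here so that the wraparound is well
 * defined; with plain int the overflow would be undefined behaviour.
 */
static uint32_t
avail_bytes_narrow(int npages)
{
	return (uint32_t)npages * SKETCH_PAGE_SIZE;
}
```

With 2097152 pages of 4096 bytes (8 GiB), the widened version yields
8589934592 while the 32-bit product wraps to 0.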


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
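
The time-of-day computation described above (boottime as a timespec
plus the uptime logged with the message, truncated to whole seconds)
can be sketched as below. The helper names are hypothetical and the
code is a userland illustration, not the dmesg(8) source.

```c
#include <assert.h>
#include <time.h>

/* Add two timespecs, carrying nanoseconds into seconds. */
static struct timespec
sketch_timespec_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= 1000000000L) {
		r.tv_sec++;
		r.tv_nsec -= 1000000000L;
	}
	return r;
}

/*
 * Time of day for a message: boot time plus the uptime stamped on the
 * message.  The result is truncated, not rounded, to whole seconds.
 */
static time_t
sketch_msg_tod(struct timespec boottime, struct timespec msg_uptime)
{
	return sketch_timespec_add(boottime, msg_uptime).tv_sec;
}
```

Treating boottime as a timeval here (microseconds in the second field)
would misplace the fractional part, which is the kind of unit mix-up
the dmesg fix addresses.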


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
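
Since the kernel's use of initpaths is not documented here, the
following is only a guess at the general shape: walk the
NULL-terminated list in order and take the first path that can be
started. The try_exec callback and first_usable_init() are
hypothetical; only the list contents come from this commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical callback: returns 0 if the path could be started. */
typedef int (*try_exec_fn)(const char *path);

/* Return the first path accepted by try_exec, or NULL if none is. */
static const char *
first_usable_init(try_exec_fn try_exec)
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Example callbacks: only /rescue/init works, or nothing works. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

static int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

Under this assumed ordering, "/rescue/init" at the end of the list is
reached only after every earlier candidate has failed.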


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
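
As an illustration of the kind of check involved, here is a minimal sketch of one of the FIPS 140-2 statistical tests, the monobit test. It is not the code in libkern/rngtest.c. The function and macro names are hypothetical, and the pass bounds (9725 < ones < 10275 over 20,000 bits) are quoted from memory of the standard and should be checked against it.

```c
#include <stddef.h>
#include <stdint.h>

/* FIPS 140-2 statistical tests operate on a 20,000-bit sample. */
#define FIPS_SAMPLE_BYTES 2500

/* Count the set bits in one byte. */
static int
popcount8(uint8_t b)
{
	int n = 0;

	while (b != 0) {
		n += b & 1;
		b >>= 1;
	}
	return n;
}

/*
 * Hypothetical helper: returns 1 if the sample passes the monobit
 * test, 0 otherwise.  The bounds are an assumption taken from the
 * standard's published limits, not from the NetBSD sources.
 */
int
fips_monobit_ok(const uint8_t buf[FIPS_SAMPLE_BYTES])
{
	int ones = 0;

	for (size_t i = 0; i < FIPS_SAMPLE_BYTES; i++)
		ones += popcount8(buf[i]);

	return ones > 9725 && ones < 10275;
}
```

A stuck source that returns all zero bits fails this check immediately, which is the "bad or stuck hardware" case the paragraph above describes.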

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
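
The behaviour described above (lockless fast path, one global lock on the slow path, no retry after an error) can be sketched in userland C roughly as follows. This is only an illustration under stated assumptions. It uses C11 atomics and a pthread mutex rather than kernel primitives. All names (once_sketch, run_once_sketch, demo_failing_init) are hypothetical. Unlike a real implementation it holds the global lock while the init routine runs, so an init routine that itself calls run_once_sketch() would deadlock.

```c
#include <pthread.h>
#include <stdatomic.h>

#define ONCE_INITIAL	0
#define ONCE_DONE	1

/* Hypothetical once-control object: state plus the cached result. */
struct once_sketch {
	atomic_int state;	/* ONCE_INITIAL or ONCE_DONE */
	int error;		/* return value of the init routine */
};

/* One global lock shared by all once objects (slow path only). */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock taken once initialization has finished. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		/* Run the routine exactly once, even if it fails. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);

	return o->error;
}

/* Demonstration-only init routine that always fails with code 22. */
int demo_calls;

int
demo_failing_init(void)
{
	demo_calls++;
	return 22;
}
```

Calling run_once_sketch() twice with demo_failing_init returns 22 both times while demo_calls stays at 1, which matches the "does not retry after an error" semantics in the commit message.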


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
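
The arithmetic behind that cap can be sketched as below. This is a worked illustration of the rule as stated in the commit message, not the actual sysctl handler. The function name and the idea of passing both sizes in bytes are assumptions.

```c
#include <stdint.h>

/* Worst-case cost of one vnode and its associated data, per the log. */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest desiredvnodes value whose worst-case
 * footprint stays within 75% of kernel virtual address space and
 * within 75% of physical memory.
 */
uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	/* The tighter of the two resources decides the limit. */
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the limit, divided by the per-vnode cost. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory, the KVA is the binding constraint and the cap works out to 393216 vnodes.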


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
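
The reference-counting rule above (a lookup takes a reference, a release drops it, and detaching an in-use discipline fails with EBUSY) can be sketched as follows. This is an illustration only, not the kernel code: struct ldisc and the ldisc_*() functions are hypothetical stand-ins for the real ttyldisc_*() interfaces, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified line discipline record. */
struct ldisc {
	const char	*name;
	unsigned	 refcnt;	/* outstanding lookups */
	int		 attached;	/* nonzero while registered */
};

/* A successful lookup holds a reference on the discipline. */
struct ldisc *
ldisc_lookup(struct ldisc *d)
{
	if (!d->attached)
		return NULL;
	d->refcnt++;
	return d;
}

/* Drop a reference taken by ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Detaching fails with EBUSY while any reference is held. */
int
ldisc_detach(struct ldisc *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

A real implementation would protect refcnt and the list of disciplines with a lock; the sketch leaves that out to keep the rule itself visible.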


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
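
The expansion itself can be sketched in userland C as below. This is an illustration under stated assumptions, not the kernel's code: struct magic_var and magic_expand() are hypothetical, the values are supplied by the caller, and the sketch assumes a magic string only matches when it is followed by '/' or the end of the path (which keeps @machine from matching inside @machine_arch).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair, e.g. { "machine_arch", "x86_64" }. */
struct magic_var {
	const char *name;
	const char *value;
};

/*
 * Copy src into dst, replacing each "@name" token with its value.
 * Returns 0 on success, or -1 if the result does not fit in dstlen.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 1, replen = 1;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				/* Whole-token match: next char is '/' or NUL. */
				if (strncmp(src + 1, vars[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = vars[i].value;
					replen = strlen(rep);
					skip = n + 1;
					break;
				}
			}
		}
		if (rep == NULL)
			rep = src;	/* no match: copy one literal char */
		if (out + replen >= dstlen)
			return -1;	/* no room for data plus the NUL */
		memcpy(dst + out, rep, replen);
		out += replen;
		src += skip;
	}
	dst[out] = '\0';
	return 0;
}
```

With a table mapping machine_arch to the running architecture, the symlink target /arch/@machine_arch/bin from the example above would resolve to that architecture's directory.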


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
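
The sizing rule can be sketched roughly as follows. This is an illustration
only: the function name, the per-mille parameter and the vnode size used in
the example are assumptions, and the floor is borrowed from the "minimum
NVNODE vnodes" wording of rev 1.175 on the assumption that it still applies.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of vnodes that fit in `permille` thousandths
 * of physical memory, never less than `floor` (NVNODE in the kernel config).
 * permille = 10 corresponds to the 1% default, 5 to the older 0.5%.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned permille, uint64_t floor)
{
	uint64_t budget, n;

	assert(vnode_size > 0);
	budget = physmem_bytes / 1000 * permille;	/* bytes set aside */
	n = budget / vnode_size;			/* whole vnodes only */
	return n < floor ? floor : n;
}
```

With a made-up 256-byte vnode, moving from 5 to 10 per mille roughly doubles
the count on a 1 GB machine, while a 16 MB machine stays at the floor. A
larger struct vnode, as on the 64-bit ports, yields fewer vnodes for the same
percentage.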


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
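
The "iterate until nobody did any work" loop might look roughly like this.
The names `finalizer`, `run_finalizers` and `drain_one` are hypothetical and
the table-based registration is an assumption; only the termination rule
comes from the message above.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer, and repeat the whole pass until one complete
 * pass reports no work at all. Returns the number of passes made.
 */
int
run_finalizers(const struct finalizer *tab, size_t n)
{
	int passes = 0, did_work;
	size_t i;

	assert(tab != NULL || n == 0);
	do {
		did_work = 0;
		for (i = 0; i < n; i++) {
			if (tab[i].fn(tab[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: consume one unit of pending work per call. */
int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating the full pass matters because work done by one finalizer (say,
configuring a RAID set) can expose new devices for another to act on.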


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
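
The same table as a C switch, purely as a sketch: the enum numbering and the
function name `kern_to_errno` are hypothetical, and KERN_FAILURE is left out
because it has no single replacement.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical numbering; only the names come from the table above. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Translate an old KERN_* code into the traditional errno value. */
int
kern_to_errno(enum kern_code code)
{
	switch (code) {
	case KERN_SUCCESS:		return 0;
	case KERN_INVALID_ADDRESS:	return EFAULT;
	case KERN_PROTECTION_FAILURE:	return EACCES;
	case KERN_NO_SPACE:		return ENOMEM;
	case KERN_INVALID_ARGUMENT:	return EINVAL;
	case KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	assert(!"unmapped KERN_* code");
	return EINVAL;
}
```

Note that two distinct old codes collapse into ENOMEM, so the conversion
cannot be reversed.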


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
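
The overflow half of the problem can be illustrated as follows. The message
does not show how "available memory" is computed, so pages times page size is
an assumption here, as are both function names; uint64_t stands in for
rlim_t.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit arithmetic: the product wraps once it exceeds UINT32_MAX. */
uint32_t
avail_bytes_32(uint32_t pages, uint32_t page_size)
{
	assert(page_size != 0);
	return pages * page_size;
}

/* Widen before multiplying so the full byte count is kept. */
uint64_t
avail_bytes_64(uint32_t pages, uint32_t page_size)
{
	assert(page_size != 0);
	return (uint64_t)pages * page_size;
}
```

For 1048576 pages of 4096 bytes (4 GB), the 32-bit version wraps around to
zero while the widened version keeps the full value.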


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
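
For reference, the usual way to honour a power-of-two alignment is a
mask-based round-up. This is a generic sketch, not the uvm_map() code; the
names `is_pow2` and `align_up` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* True if x has exactly one bit set. */
int
is_pow2(uintptr_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(is_pow2(align));
	return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works for powers of two, which is presumably why the
interface restricts the argument that way.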


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
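
Both properties can be seen in a toy timing wheel. Everything here is a
sketch under assumptions: the wheel size, the struct layout and the names
`callout_insert`, `callout_remove`, `callout_fire` and `count_hit` are
hypothetical and do not reflect the kernel's actual interface.

```c
#include <assert.h>
#include <stddef.h>

#define WHEEL_SIZE 8	/* hypothetical; must be a power of two */

/* The caller owns this storage, so scheduling never allocates. */
struct callout {
	struct callout *next, **prevp;
	unsigned tick;		/* absolute expiry tick */
	void (*fn)(void *);
	void *arg;
};

struct callout *wheel[WHEEL_SIZE];

/* O(1): push onto the bucket selected by the expiry tick. */
void
callout_insert(struct callout *c, unsigned tick, void (*fn)(void *), void *arg)
{
	struct callout **head = &wheel[tick & (WHEEL_SIZE - 1)];

	assert(c != NULL && fn != NULL);
	c->tick = tick;
	c->fn = fn;
	c->arg = arg;
	c->next = *head;
	c->prevp = head;
	if (*head != NULL)
		(*head)->prevp = &c->next;
	*head = c;
}

/* O(1): unlink through the back pointer, no bucket walk needed. */
void
callout_remove(struct callout *c)
{
	assert(c != NULL && c->prevp != NULL);
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* Fire everything in the current bucket that expires exactly now. */
int
callout_fire(unsigned now)
{
	struct callout *c, *next;
	int fired = 0;

	for (c = wheel[now & (WHEEL_SIZE - 1)]; c != NULL; c = next) {
		next = c->next;
		if (c->tick != now)
			continue;	/* same bucket, later lap */
		callout_remove(c);
		c->fn(c->arg);
		fired++;
	}
	return fired;
}

/* Example handler: count invocations through an int pointer. */
void
count_hit(void *arg)
{
	++*(int *)arg;
}
```

Only firing walks a bucket; scheduling and cancelling touch a fixed number of
pointers regardless of how many callouts are pending.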


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second-level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
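
In code, the rule for each of the two limits comes down to something like the
sketch below. The helper names and the default values are hypothetical; only
the "hard limit is the system maximum, soft limit is the clamped default"
logic is taken from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical compile-time defaults standing in for NOFILE and MAXUPRC. */
#define NOFILE_DEFAULT	64
#define MAXUPRC_DEFAULT	80

static rlim_t
rlim_min(rlim_t a, rlim_t b)
{
	return a < b ? a : b;
}

/*
 * Hard limit: the system-wide maximum (maxfiles or maxproc), rather than
 * RLIM_INFINITY. Soft limit: the compile-time default, clamped so it can
 * never exceed the hard limit.
 */
void
init_limit(struct rlimit *rl, rlim_t dflt, rlim_t sysmax)
{
	assert(rl != NULL);
	rl->rlim_max = sysmax;
	rl->rlim_cur = rlim_min(dflt, sysmax);
}
```

A program calling getrlimit() then sees a finite rlim_max it can size its
tables from, and rlim_cur never exceeds rlim_max even on a kernel configured
with a very small maxproc.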


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
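
The first two rules swap the source and destination arguments, which is the
easy part to get wrong in a mechanical conversion. A small sketch in standard
C; the wrapper names are hypothetical and keep the old (src, dst, len)
argument order purely to show the swap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old: bcopy(src, dst, len). New: destination first, no overlap allowed. */
void
copy_region(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* Old: ovbcopy(src, dst, len). New: memmove(), which tolerates overlap. */
void
shift_region(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memmove(dst, src, len);
}

/* Old: bzero(buf, len). New: memset() with an explicit fill byte of 0. */
void
clear_region(void *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0, len);
}

/* Old: bcmp(a, b, len). New: memcmp(); zero still means "equal". */
int
regions_equal(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}
```

bcmp() only reports equal or not equal, so any caller that tested its result
against zero keeps working unchanged with memcmp().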


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Stanford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


Revision tags: netbsd-10-0-RC3 netbsd-10-0-RC2 netbsd-10-0-RC1 netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
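
The thresholds described in this entry can be sketched as a small C function. This is only an illustration of the stated numbers, not the code in init_main.c: the name default_maxfiles, the GIB macro, and the fallback parameter for machines with less than 1GB of RAM are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Illustrative sketch only: default_maxfiles(), GIB and the fallback
 * parameter are hypothetical names, not identifiers from init_main.c.
 */
#define GIB	(UINT64_C(1024) * 1024 * 1024)

static int
default_maxfiles(uint64_t ram_bytes, int fallback)
{
	assert(fallback > 0);

	if (ram_bytes >= 16 * GIB)
		return 20000;	/* 16GB RAM or more */
	if (ram_bytes >= 1 * GIB)
		return 10000;	/* 1GB RAM or more */
	return fallback;	/* smaller machines keep the caller's default */
}
```

A kernel config that sets MAXFILES explicitly would bypass a computation like this entirely, which is why the sketch only models the "not otherwise specified" case.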


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
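
The epoch usage rule above (cache the epoch when you seed, reseed if it has changed, start the cache at 0 to force a reseed on first use) might look roughly like the following userland sketch. Everything here is a stand-in: fake_entropy_epoch() only imitates the kernel's entropy_epoch(), the unsigned type is an assumption, and struct consumer is a hypothetical user of the API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's epoch counter. */
static unsigned fake_epoch = (unsigned)-1;	/* -1: never reached full entropy */

static unsigned
fake_entropy_epoch(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that keeps seeded state. */
struct consumer {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned nreseeds;	/* how many times we reseeded */
};

/* Returns true if this use triggered a reseed. */
static bool
consumer_use(struct consumer *c)
{
	unsigned epoch;

	assert(c != NULL);
	epoch = fake_entropy_epoch();
	if (c->cached_epoch != epoch) {
		/* Real code would draw fresh seed material here. */
		c->cached_epoch = epoch;
		c->nreseeds++;
		return true;
	}
	return false;
}
```

Comparing for a change, rather than testing for -1, means a later consolidation still triggers a reseed even if the system had already claimed full entropy.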

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
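
As a rough model of the lock / save / suspend / set / unlock pattern described above, here is a userland sketch. The sk_ functions are hypothetical stand-ins and not the kernel's todr_*() routines; the lock is modelled as a plain flag, and the sketch asserts the intended locking discipline even though, per the note above, todr_save_systime() does not currently assert it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userland stand-ins; none of these are kernel functions. */
static bool sk_todr_held;	/* models the TODR mutex */
static long sk_todr_clock;	/* models the time-of-day clock value */
static long sk_systime;		/* models the running system time */

static void
sk_todr_lock(void)
{
	assert(!sk_todr_held);
	sk_todr_held = true;
}

static void
sk_todr_unlock(void)
{
	assert(sk_todr_held);
	sk_todr_held = false;
}

static void
sk_todr_save_systime(void)
{
	assert(sk_todr_held);	/* intended discipline: lock held */
	sk_todr_clock = sk_systime;
}

static void
sk_todr_set_systime(long seconds_asleep)
{
	assert(sk_todr_held);
	/* The TOD clock kept running while the system was asleep. */
	sk_systime = sk_todr_clock + seconds_asleep;
}

static void
sk_suspend_resume(long seconds_asleep)
{
	sk_todr_lock();
	sk_todr_save_systime();
	/* machine-dependent sleep/suspend would happen here */
	sk_todr_set_systime(seconds_asleep);
	sk_todr_unlock();
}
```

Holding the lock across the whole sequence is the point of the pattern: nothing else can touch the clock between saving and restoring the time.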


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
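
The boottime adjustment described above amounts to a timespec
subtraction: the wall-clock time at which the time of day was set, minus
the interval since the clocks were started. A minimal userland sketch,
assuming hypothetical helper names (ts_sub, boottime_from) that are not
taken from the kernel source:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: return a - b with tv_nsec normalised to [0, 1e9). */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second when the nanosecond part underflows. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Hypothetical: derive the boot time from the time of day at the moment
 * it was initialised and the uptime accumulated up to that moment.
 */
static struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	assert(uptime_at_init.tv_nsec >= 0 &&
	    uptime_at_init.tv_nsec < 1000000000L);
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With the time of day set 6.5 seconds after the clocks started, the
result is 6.5 seconds earlier than the time of day, which is the kind of
offset the commit removes.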


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.
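
The ordering problem can be reproduced in miniature. The sketch below
is an illustration only: the demo_ names are stand-ins invented here,
not kernel symbols, and the real functions do far more work.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; the demo_ names are not kernel symbols. */
static bool demo_in6_present;
static bool demo_ifq_node_created;

/* Stand-in for the domain init that turns the flag on. */
static void
demo_in6_dom_init(void)
{
	demo_in6_present = true;
}

/* Stand-in for the sysctl setup that is skipped when the flag is off. */
static void
demo_if_sysctl_setup(void)
{
	if (demo_in6_present)
		demo_ifq_node_created = true;
}

/* Run both steps in the old or the fixed order; report whether the node exists. */
static bool
demo_boot(bool fixed_order)
{
	demo_in6_present = false;
	demo_ifq_node_created = false;
	if (fixed_order) {
		demo_in6_dom_init();
		demo_if_sysctl_setup();
	} else {
		demo_if_sysctl_setup();
		demo_in6_dom_init();
	}
	assert(demo_in6_present);	/* the flag ends up set either way */
	return demo_ifq_node_created;
}
```

In the old order the flag is still false when it is tested, so the node
is never created even though the flag is eventually set.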

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
    1 - ptrace(2) syscall for native emulation
    2 - common ptrace(2) syscall code (shared with compat_netbsd32)
    3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
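
The size mix-up can be illustrated in userland. This is a sketch under
assumptions: struct fake_execve_args is a made-up stand-in for a syscall
argument structure, and memcpy() stands in for copyout().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Made-up stand-in for a syscall argument structure (not the real one). */
struct fake_execve_args {
	const char *path;
	char *const *argp;
	char *const *envp;
	long pad;	/* illustrative extra member so the sizes differ */
};

/* Bytes needed for an argv[] of three pointers, as in the commit text. */
static size_t
init_argv_size(void)
{
	return 3 * sizeof(char *);
}

/* Copy exactly the argv vector; memcpy() stands in for copyout(). */
static size_t
copy_init_argv(void *dst, size_t dstlen, char *const argv[3])
{
	size_t len = init_argv_size();

	assert(dstlen >= len);
	memcpy(dst, argv, len);
	return len;
}
```

Using sizeof(struct fake_execve_args) as the length would copy a
different number of bytes than the three pointers that make up argv[].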


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)
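
The deferred-establishment pattern can be sketched as follows. This is
a single-threaded userland analogue, not the kernel's RUN_ONCE()
implementation: demo_run_once and the other demo_ names are invented for
illustration, and no locking is attempted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative once-control; not the kernel's once type. */
struct demo_once {
	bool done;
	int error;
};

/* Run fn the first time only and remember its result (no locking). */
static int
demo_run_once(struct demo_once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);
	if (!o->done) {
		o->error = fn();
		o->done = true;
	}
	return o->error;
}

static int demo_hook_count;
static struct demo_once demo_sem_once;

/* Stand-in for establishing the exit hook. */
static int
demo_establish_exithook(void)
{
	demo_hook_count++;
	return 0;
}

/* Stand-in for a syscall entry point that establishes the hook lazily. */
static int
demo_sys_semop(void)
{
	return demo_run_once(&demo_sem_once, demo_establish_exithook);
}
```

If no syscall is ever invoked, the hook is never established, which is
why the removal path can no longer assert that it was set.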

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)
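
The restriction can be pictured with a small ring buffer whose entries
hold a format pointer and four raw argument values. This is a sketch
with invented demo_ names, not the kernhist data structures; it assumes
that only the raw values are recorded, which would explain why string
arguments are awkward for an external reader.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_HIST_ENTRIES 4	/* illustrative capacity */

/* One entry: a literal format string and up to four raw arguments. */
struct demo_hist_ent {
	const char *fmt;
	uintptr_t args[4];
};

struct demo_hist {
	struct demo_hist_ent e[DEMO_HIST_ENTRIES];
	size_t next;	/* slot the next entry is written to */
	size_t count;	/* number of valid entries, capped at capacity */
};

/* Record one message, overwriting the oldest entry once the ring is full. */
static void
demo_hist_log(struct demo_hist *h, const char *fmt,
    uintptr_t a0, uintptr_t a1, uintptr_t a2, uintptr_t a3)
{
	struct demo_hist_ent *e;

	assert(h != NULL && fmt != NULL);
	e = &h->e[h->next];
	e->fmt = fmt;
	e->args[0] = a0;
	e->args[1] = a1;
	e->args[2] = a2;
	e->args[3] = a3;
	h->next = (h->next + 1) % DEMO_HIST_ENTRIES;
	if (h->count < DEMO_HIST_ENTRIES)
		h->count++;
}
```

Formatting is left to whoever reads the buffer later, so the logging
path stays cheap.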

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add:
	options SPLASHSCREEN
	makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now
handled as part of module initialization, and do not require manual
invocation. sysmon_taskq is special, since it is potentially used by
several non-module users who may need the facility before modules are
fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.
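
A NULL-terminated list like this is typically walked until one entry
works. The sketch below shows that traversal only; first_usable_init,
try_exec_fn and the two demo predicates are hypothetical, and the real
start_init() logic is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical predicate: returns 0 if the path could be executed. */
typedef int (*try_exec_fn)(const char *path);

/* Return the first path the predicate accepts, or NULL if none is. */
static const char *
first_usable_init(try_exec_fn try_exec)
{
	assert(try_exec != NULL);
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Demo predicate: pretend only /rescue/init exists. */
static int
demo_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Demo predicate: pretend nothing exists. */
static int
demo_nothing(const char *path)
{
	(void)path;
	return -1;
}
```

Placing "/rescue/init" last makes it a fallback that is tried only after
the other candidates have failed.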

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from both MD code and MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
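
The request/response idea, including the error on absence, can be
sketched with a small lookup table. Everything here is an illustrative
assumption: the demo_ names, the table size, the use of a plain uid in
place of a credential, and the choice of ENOENT as the error are not
taken from the secmodel(9) implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Illustrative evaluator callback: answers a named question. */
typedef int (*demo_eval_fn)(const char *what, void *arg, void *ret);

struct demo_secmodel {
	const char *id;
	demo_eval_fn eval;
};

#define DEMO_MAX_MODELS 4	/* illustrative table size */
static struct demo_secmodel demo_models[DEMO_MAX_MODELS];
static size_t demo_nmodels;

/* Register an evaluator under an identifier. */
static int
demo_register(const char *id, demo_eval_fn eval)
{
	assert(id != NULL && eval != NULL);
	if (demo_nmodels == DEMO_MAX_MODELS)
		return ENOMEM;
	demo_models[demo_nmodels].id = id;
	demo_models[demo_nmodels].eval = eval;
	demo_nmodels++;
	return 0;
}

/* Forward a request to the named model; fail if it is not registered. */
static int
demo_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < demo_nmodels; i++) {
		if (strcmp(demo_models[i].id, id) == 0)
			return demo_models[i].eval(what, arg, ret);
	}
	return ENOENT;	/* assumed error value for an absent model */
}

/* Demo evaluator: "is-root" is true when the uid passed in is 0. */
static int
demo_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") == 0) {
		*(bool *)ret = (*(const unsigned *)arg == 0);
		return 0;
	}
	return ENOENT;
}
```

The caller never touches the other model's state directly, so a missing
model shows up as an error return rather than as an unsafe access.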

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
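
The idea behind a continuous-output check for stuck hardware can be
sketched as a comparison of each block with the one before it. This is
an illustration only: the block size, the demo names and the structure
are assumptions, and it is not the code in libkern/rngtest.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CT_BLOCK 8	/* illustrative block size in bytes */

/* State for the check: the previously seen block, if any. */
struct cont_test {
	uint8_t prev[CT_BLOCK];
	bool have_prev;
};

static void
cont_test_init(struct cont_test *ct)
{
	assert(ct != NULL);
	memset(ct, 0, sizeof(*ct));
}

/*
 * Feed one block. Returns false if it is identical to the previous
 * block (a sign of a stuck source), true otherwise.
 */
static bool
cont_test_feed(struct cont_test *ct, const uint8_t block[CT_BLOCK])
{
	bool ok = true;

	assert(ct != NULL && block != NULL);
	if (ct->have_prev && memcmp(ct->prev, block, CT_BLOCK) == 0)
		ok = false;
	memcpy(ct->prev, block, CT_BLOCK);
	ct->have_prev = true;
	return ok;
}
```

A caller would detach the source when the check fails, in line with the
behaviour described above for bad or stuck hardware RNGs.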

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed with no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
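
The behaviour described in this entry (a lockless fast path, a global lock on the slow path, and no retry after an error) can be sketched in userland C as below. This is an illustrative sketch only, not the kernel code: it assumes C11 atomics and a pthread mutex in place of the kernel primitives, and the names struct once, once_lock, run_once(), init_calls and example_init_fails() are all hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical control block; not the kernel's once structure. */
struct once {
	atomic_int done;	/* set to 1 after the routine has run */
	int error;		/* result of the single run, cached */
};

/* One global lock shared by all once blocks (assumption for this sketch). */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has already run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; an error is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Example routine that always fails, counting how often it is called. */
static int init_calls;

static int
example_init_fails(void)
{
	init_calls++;
	return 5;
}
```

Calling run_once() twice on the same block with example_init_fails() returns the same error both times while the routine itself runs only once, which mirrors the "does not retry after an error" semantics stated above.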


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
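
The limit described in this entry can be sketched as simple arithmetic. This is a hedged illustration, not the code from the commit: the names VNODE_COST_BYTES, vnode_limit() and vnode_count_ok() are hypothetical, and the KVA and physical memory sizes are assumed to be passed in as plain byte counts.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode, taken from the log entry (2kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: the largest vnode count whose worst-case cost
 * stays within 75% of KVA and within 75% of physical memory, which is
 * the same as 75% of the smaller of the two.
 */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so the product cannot overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Returns nonzero if a requested vnode count would be accepted. */
static int
vnode_count_ok(uint64_t wanted, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return wanted <= vnode_limit(kva_bytes, phys_bytes);
}
```

With 1 GB of KVA and 4 GB of physical memory, the smaller figure governs: 75% of 1 GB divided by 2 kB gives a ceiling of 393216 vnodes under these assumptions.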


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot
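
The boottime rule above can be sketched as follows. This is only an illustration under assumptions: the structure and the example_* functions are hypothetical and are not the kernel's code. The idea shown is that when the clock is stepped, boottime moves by the same delta, so the reported uptime does not jump.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical clock state, in seconds. */
struct example_clock {
	long long now;		/* current wall-clock time */
	long long boottime;	/* wall-clock time attributed to boot */
};

/* Step the clock; boottime tracks only the delta of the set-time operation. */
static void
example_settime(struct example_clock *c, long long newnow)
{
	long long delta;

	assert(c != NULL);
	delta = newnow - c->now;
	c->now = newnow;
	c->boottime += delta;
}

/* Time since boot, as uptime(1) would report it in this sketch. */
static long long
example_uptime(const struct example_clock *c)
{
	assert(c != NULL);
	return c->now - c->boottime;
}
```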


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
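
The reference-counting rule for line disciplines can be sketched like this. It is a minimal single-threaded illustration, assuming hypothetical example_* names and a simplified structure; the real ttyldisc_*() functions have their own signatures and locking, which are not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified line discipline record. */
struct example_ldisc {
	const char *name;
	unsigned refcnt;	/* references held by lookups */
	int attached;		/* nonzero while registered */
};

/* A successful lookup holds a reference until it is released. */
static struct example_ldisc *
example_ldisc_lookup(struct example_ldisc *d)
{
	assert(d != NULL);
	if (!d->attached)
		return NULL;
	d->refcnt++;
	return d;
}

/* Drop a reference obtained from a lookup. */
static void
example_ldisc_release(struct example_ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Detaching an in-use discipline fails with EBUSY. */
static int
example_ldisc_detach(struct example_ldisc *d)
{
	assert(d != NULL);
	if (d->refcnt != 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```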


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
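
The "once control" pattern used in this and the neighbouring commits can be sketched as below. This is a single-threaded illustration with hypothetical names (struct once_ctl, run_once, example_lkm_*); it is not the kernel's actual interface, and a real version needs locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical once control: remembers whether init ran and its result. */
struct once_ctl {
	bool done;
	int status;
};

static int lkm_init_calls;	/* counts how often the init routine ran */

/* Stand-in for a subsystem init routine. */
static int
example_lkm_init(void)
{
	lkm_init_calls++;
	return 0;
}

/* Run fn on the first call only; later calls return the cached status. */
static int
run_once(struct once_ctl *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);
	if (!o->done) {
		o->status = fn();
		o->done = true;
	}
	return o->status;
}

/* Called on every open; only the first call initializes the subsystem. */
static int
example_lkm_open(void)
{
	static struct once_ctl lkm_once;

	return run_once(&lkm_once, example_lkm_init);
}
```

With this shape, main() no longer needs to call the init routine at boot; the first user of the subsystem pays for initialization instead.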


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
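
The expansion step can be sketched in userspace C as follows. This is a minimal illustration, not the kernel's implementation: the table contents, the function name magic_expand, and the rule that a magic name must run up to a '/' or the end of the string are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of magic names and the values they expand to. */
struct example_magic {
	const char *name;
	const char *value;
};

static const struct example_magic magic_tab[] = {
	{ "machine",      "macppc" },
	{ "machine_arch", "powerpc" },
	{ "ostype",       "NetBSD" },
};

#define NMAGIC (sizeof(magic_tab) / sizeof(magic_tab[0]))

/*
 * Copy src to dst, replacing "@name" with its value when the name is
 * followed by '/' or the end of the string.  Unknown "@..." text is
 * copied unchanged.  Returns 0 on success, -1 if dst is too small.
 */
static int
magic_expand(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 1, replen = 1;

		if (*src == '@') {
			for (size_t i = 0; i < NMAGIC; i++) {
				size_t n = strlen(magic_tab[i].name);

				if (strncmp(src + 1, magic_tab[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = magic_tab[i].value;
					replen = strlen(rep);
					skip = n + 1;
					break;
				}
			}
		}
		if (out + replen >= dstlen)	/* keep room for the NUL */
			return -1;
		memcpy(dst + out, rep != NULL ? rep : src, replen);
		out += replen;
		src += skip;
	}
	if (out >= dstlen)
		return -1;
	dst[out] = '\0';
	return 0;
}
```

Requiring a terminator after the name keeps "@machine" from matching the front of "@machine_arch".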


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
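
A marker-skipping iteration macro of this kind can be sketched as below. The list layout, the is_marker field, and the EXAMPLE_PROCLIST_FOREACH name are assumptions for illustration; the real PROCLIST_FOREACH is built on the kernel's own list and process structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified process list entry. */
struct example_proc {
	struct example_proc *next;
	int is_marker;	/* placeholder entry inserted by an iterator */
	int pid;
};

/* Visit every real entry on the list, skipping marker entries. */
#define EXAMPLE_PROCLIST_FOREACH(var, head)				\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker) continue; else

/* Count the real processes on a list. */
static int
count_procs(struct example_proc *head)
{
	struct example_proc *p;
	int n = 0;

	EXAMPLE_PROCLIST_FOREACH(p, head) {
		assert(!p->is_marker);	/* markers never reach the body */
		n++;
	}
	return n;
}
```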


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that either in arch-specific
code, or aren't initialiased during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
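
The sizing rule amounts to simple arithmetic, sketched here under assumptions: the function name and the per-vnode footprint parameter are hypothetical, and the kernel's actual computation may differ in detail. It also shows why a larger struct vnode yields fewer cached vnodes for the same percentage.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in `percent` of physical memory.
 * per_vnode_bytes is a hypothetical footprint per cached vnode.
 */
static uint64_t
vnode_cache_size(uint64_t physmem_bytes, unsigned percent,
    uint64_t per_vnode_bytes)
{
	assert(per_vnode_bytes > 0 && percent <= 100);
	return (physmem_bytes / 100 * percent) / per_vnode_bytes;
}
```

Doubling the percentage doubles the result (up to rounding), while doubling the per-vnode footprint halves it, which is the concern raised for 64-bit ports.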


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now the same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
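
The "iterate until nobody did any work" loop can be sketched as follows. The names (finalizer_fn, run_finalizers, example_raid_finalize) and the counter standing in for pending RAID sets are hypothetical; this is not the autoconfiguration code itself.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void);

/*
 * Call every finalizer repeatedly until a full pass does no work.
 * Returns the number of passes made, including the final idle one.
 */
static int
run_finalizers(const finalizer_fn *fns, size_t n)
{
	int passes = 0, did_work;

	assert(fns != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			did_work |= (fns[i]() != 0);
		passes++;
	} while (did_work);
	return passes;
}

static int pending_raid_sets = 2;	/* hypothetical outstanding work */

/* Configure one pending set per call; report whether anything was done. */
static int
example_raid_finalize(void)
{
	if (pending_raid_sets == 0)
		return 0;
	pending_raid_sets--;
	return 1;
}
```

Repeating the whole pass matters because work done by one finalizer (for example, a newly configured RAID set) may expose devices that another finalizer can then act on.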


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS                0
KERN_INVALID_ADDRESS        EFAULT
KERN_PROTECTION_FAILURE     EACCES
KERN_NO_SPACE               ENOMEM
KERN_INVALID_ARGUMENT       EINVAL
KERN_FAILURE                various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE      ENOMEM
KERN_NOT_RECEIVER           <unused>
KERN_NO_ACCESS              <unused>
KERN_PAGES_LOCKED           <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
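
The sizing rule above amounts to a small calculation: take 0.5% of physical memory, divide by the per-vnode cost, and never go below the configured minimum. The sketch below is an illustration under stated assumptions, not the kernel's code: EXAMPLE_NVNODE and EXAMPLE_VNODE_SIZE are placeholder values, autosize_vnodes is a hypothetical name, and the "only if not set explicitly in the kernel config" check is left out.

```c
#include <stdint.h>

#define EXAMPLE_NVNODE		1024u	/* placeholder for the NVNODE minimum */
#define EXAMPLE_VNODE_SIZE	256u	/* placeholder bytes charged per vnode */

/*
 * Budget 0.5% of memory (mem_bytes / 200) for vnodes, convert the
 * budget to a vnode count, and clamp it to the minimum.
 */
static uint64_t
autosize_vnodes(uint64_t mem_bytes)
{
	uint64_t budget = mem_bytes / 200;		/* 0.5% of memory */
	uint64_t n = budget / EXAMPLE_VNODE_SIZE;

	return (n < EXAMPLE_NVNODE) ? EXAMPLE_NVNODE : n;
}
```

With these placeholder values, small-memory machines stay at the minimum while larger machines scale linearly with memory.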


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump,
and relocating it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].
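
Stripping the leading path from argv[0] is the usual "basename" idiom. A minimal sketch, assuming a NUL-terminated path string; strip_leading_path is a hypothetical name and not the function used in init_main.c:

```c
#include <string.h>

/*
 * Return a pointer to the component after the last '/' in path,
 * or path itself if it contains no '/'.
 */
static const char *
strip_leading_path(const char *path)
{
	const char *slash = strrchr(path, '/');

	return (slash != NULL) ? slash + 1 : path;
}
```

Under this sketch, init started from a path such as /sbin/init would see just the final component as its argv[0].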


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
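
The two rules above can be sketched as follows: the hard limit is the system-wide maximum rather than infinity, and the soft limit is the compile-time default clamped to that maximum. The types and names here (limit_t, struct limit, init_limit) are hypothetical stand-ins for rlim_t and struct rlimit, and the numbers in the usage note are made up.

```c
/* Stand-in for rlim_t in this sketch. */
typedef unsigned long long limit_t;

/* Stand-in for struct rlimit: soft (cur) and hard (max) limits. */
struct limit {
	limit_t cur;
	limit_t max;
};

static limit_t
limit_min(limit_t a, limit_t b)
{
	return (a < b) ? a : b;
}

/*
 * dflt is the compile-time default (e.g. NOFILE or MAXUPRC);
 * sysmax is the system-wide ceiling (e.g. maxfiles or maxproc).
 */
static struct limit
init_limit(limit_t dflt, limit_t sysmax)
{
	struct limit l;

	l.max = sysmax;				/* not RLIM_INFINITY */
	l.cur = limit_min(dflt, sysmax);	/* never above the ceiling */
	return l;
}
```

For example, a default of 64 with a ceiling of 1772 yields cur = 64 and max = 1772, while a default of 160 with a ceiling of 100 yields cur = max = 100.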


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
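
The point that is easy to miss in this mapping is that bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. A minimal sketch of the mapping as wrappers; the old_* names are hypothetical and exist only to show the argument order:

```c
#include <string.h>

/* Old BSD order: source first, destination second. */
static void
old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);		/* regions must not overlap */
}

static void
old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);		/* overlap-safe variant */
}

static int
old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);	/* zero means equal */
}

static void
old_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```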


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for a process to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.546 23-Sep-2023 ad

Reapply this change with a couple of bugs fixed:

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.545 12-Sep-2023 ad

Back out recent change to replace pool_cache with the general allocator.
Will return to this when I have time again.


# 1.544 10-Sep-2023 ad

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout
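
The progress check described above can be illustrated with a minimal userland
sketch. The struct, its fields, and the function names are hypothetical and do
not reflect the actual heartbeat(9) implementation or the struct cpu_info
layout; it only shows the idea of a checker that counts ticks without progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state; not the layout used by heartbeat(9). */
struct hb_state {
	uint64_t progress;	/* bumped by the monitored context */
	uint64_t last_seen;	/* value the checker saw at its last tick */
	unsigned stalled_ticks;	/* consecutive ticks without progress */
};

/* Called by the monitored context whenever it gets to run. */
static void
hb_beat(struct hb_state *hb)
{
	hb->progress++;
}

/*
 * Called once per tick by the checker (a hard interrupt, or another CPU).
 * Returns true once no progress has been seen for max_ticks consecutive
 * ticks, which is the point where the real mechanism would panic.
 */
static bool
hb_check(struct hb_state *hb, unsigned max_ticks)
{
	assert(max_ticks > 0);
	if (hb->progress != hb->last_seen) {
		hb->last_seen = hb->progress;
		hb->stalled_ticks = 0;
		return false;
	}
	return ++hb->stalled_ticks >= max_ticks;
}
```

Counting ticks like this is also why the commit notes a dependence on
hardclock running at a reasonably consistent rate.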


Revision tags: netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
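
As a rough illustration of the thresholds above, the selection might look like
the following sketch. The function name and the base_default parameter are
assumptions; the commit message does not state what value applies below 1GB.

```c
#include <assert.h>
#include <stdint.h>

#define GiB	(UINT64_C(1) << 30)

/*
 * Pick a default open-file limit from the amount of RAM.
 * base_default is a placeholder for whatever applies below 1GB;
 * that value is not given in the commit message.
 */
static unsigned
default_maxfiles(uint64_t ram_bytes, unsigned base_default)
{
	assert(base_default > 0);
	if (ram_bytes >= 16 * GiB)
		return 20000;
	if (ram_bytes >= 1 * GiB)
		return 10000;
	return base_default;
}
```

An explicit MAXFILES in the kernel config would bypass this selection
entirely.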


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.
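
The epoch usage described in the ENTROPY POOL notes above (cache the epoch when
seeding, reseed when it has changed, start the cache at 0 to force a first
reseed) can be sketched in userland as follows. entropy_epoch_stub() merely
reads a variable and stands in for the kernel's entropy_epoch(); the unsigned
type, the struct, and the helper names are assumptions for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's entropy_epoch(); -1 means "never full entropy". */
static unsigned fake_epoch = (unsigned)-1;

static unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that keeps some seeded state. */
struct seeded {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned reseeds;	/* how many times we have (re)seeded */
};

static void
seeded_init(struct seeded *s)
{
	s->cached_epoch = 0;
	s->reseeds = 0;
}

/* Call before using the seeded state; returns true if it reseeded. */
static bool
seeded_use(struct seeded *s)
{
	unsigned now = entropy_epoch_stub();

	if (now == s->cached_epoch)
		return false;
	/* ...gather fresh seed material here, after sampling the epoch... */
	s->cached_epoch = now;
	s->reseeds++;
	return true;
}
```

Comparing for change rather than for -1 follows the advice in the commit
message: a later consolidation still triggers a reseed.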


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
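
The lock / save / sleep / set / unlock pattern above can be sketched with
userland stand-ins. Everything with a _stub suffix is hypothetical: the real
functions operate on the TODR device and a kernel mutex, todr_set_systime()
takes arguments not shown here, and (per the NOTE) todr_save_systime() does
not currently assert that the lock is held.

```c
#include <assert.h>
#include <stdbool.h>

static bool todr_held;		/* stand-in for the TODR mutex */
static long rtc_seconds;	/* stand-in for the hardware clock */
static long system_seconds;	/* stand-in for system time */

static void
todr_lock_stub(void)
{
	assert(!todr_held);
	todr_held = true;
}

static void
todr_unlock_stub(void)
{
	assert(todr_held);
	todr_held = false;
}

/* Save system time into the clock; caller must hold the lock. */
static void
todr_save_systime_stub(void)
{
	assert(todr_held);
	rtc_seconds = system_seconds;
}

/* Load system time back from the clock; caller must hold the lock. */
static void
todr_set_systime_stub(void)
{
	assert(todr_held);
	system_seconds = rtc_seconds;
}

/* Suspend for "asleep" seconds while holding the lock throughout. */
static void
suspend_resume(long asleep)
{
	todr_lock_stub();
	todr_save_systime_stub();
	rtc_seconds += asleep;	/* the clock keeps running while suspended */
	todr_set_systime_stub();
	todr_unlock_stub();
}
```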


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
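
The class of bug fixed here, a product computed in a narrow type because a
widening cast went missing, can be shown with a small sketch. The function
names and the 32-bit operand types are assumptions; the commit message does
not say which types were involved.

```c
#include <assert.h>
#include <stdint.h>

/* Without the cast: the product is computed in 32 bits and wraps. */
static uint64_t
mem_bytes_narrow(uint32_t pages, uint32_t page_size)
{
	return pages * page_size;
}

/* With the cast: one operand is widened first, so the product is 64-bit. */
static uint64_t
mem_bytes_wide(uint32_t pages, uint32_t page_size)
{
	return (uint64_t)pages * page_size;
}
```

With 2097152 pages of 4096 bytes (8GB), the narrow version wraps to 0 on
platforms where int is 32 bits, while the wide version yields the full byte
count.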


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
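
The time-of-day computation described above amounts to adding the uptime
stamp logged with a message to boottime, both as timespecs. A minimal sketch,
with a hypothetical helper name (this is not the code in dmesg):

```c
#include <assert.h>
#include <time.h>

/* Add two timespecs, carrying nanoseconds into seconds. */
static struct timespec
timespec_add_sketch(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= 1000000000L) {
		r.tv_sec++;
		r.tv_nsec -= 1000000000L;
	}
	return r;
}
```

Treating boottime as a timeval would misread the sub-second field as
microseconds, which is the kind of mismatch the dmesg fix addresses.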


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
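
A minimal sketch of how such a NULL-terminated list can be walked, assuming
the entries are tried in order and the first one that can be started wins.
The array contents are the ones quoted above; first_usable_init and the
try_exec callbacks are hypothetical stand-ins, not kernel functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list quoted in the commit message above. */
const char *const initpaths_sketch[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path for which try_exec() reports success (0),
 * or NULL if every candidate fails. try_exec is a hypothetical
 * stand-in for the attempt to start init from that path.
 */
const char *
first_usable_init(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Demo callback: pretend only the rescue copy of init exists. */
int
demo_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Demo callback: pretend no init program can be started. */
int
demo_none(const char *path)
{
	(void)path;
	return -1;
}
```

With demo_only_rescue the walk falls through the three /sbin entries and
returns "/rescue/init", which is the case the new entry is meant to cover.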


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
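
A rough sketch of the prompt handling described above, assuming that a
"valid path" is simply an answer beginning with '/' (that test is an
assumption, as are the enum and function names, which are hypothetical and
not taken from init_main.c). Anything else would lead to the usage line
mentioned in 1.371.

```c
#include <assert.h>
#include <string.h>

enum init_answer {
	ANSWER_HALT,	/* "halt" */
	ANSWER_REBOOT,	/* "reboot" */
	ANSWER_DDB,	/* "ddb" */
	ANSWER_PATH,	/* looks like an absolute path to try as init */
	ANSWER_USAGE	/* neither command nor path: print usage */
};

/* Classify one line typed at the init path prompt. */
enum init_answer
classify_init_answer(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return ANSWER_HALT;
	if (strcmp(s, "reboot") == 0)
		return ANSWER_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return ANSWER_DDB;
	if (s[0] == '/')
		return ANSWER_PATH;
	return ANSWER_USAGE;
}
```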


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
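
A minimal user-space sketch of the pattern described above: a lockless
fast path for the already-completed case, one global lock for the slow
path, and an error result that is remembered rather than retried. It
assumes POSIX threads and C11 atomics in place of kernel primitives; all
names ending in _sketch and the demo callback are hypothetical and are
not the kernel's once code.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical states for the sketch. */
enum { ONCE_FRESH = 0, ONCE_RUNNING = 1, ONCE_DONE = 2 };

struct once_sketch {
	atomic_int state;	/* one of ONCE_* */
	int error;		/* result of the single run, valid once DONE */
};

/* One global lock and condition variable shared by every once control. */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t once_cv = PTHREAD_COND_INITIALIZER;

int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	int error;

	/* Lockless fast path: the routine has already finished. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (atomic_load(&o->state) == ONCE_FRESH) {
		/* First caller: run the routine without holding the lock. */
		atomic_store(&o->state, ONCE_RUNNING);
		pthread_mutex_unlock(&once_lock);
		error = fn();
		pthread_mutex_lock(&once_lock);
		/* Record the result, even an error, and never retry. */
		o->error = error;
		atomic_store_explicit(&o->state, ONCE_DONE, memory_order_release);
		pthread_cond_broadcast(&once_cv);
	} else {
		/* Another caller is running the routine: wait for it. */
		while (atomic_load(&o->state) != ONCE_DONE)
			pthread_cond_wait(&once_cv, &once_lock);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Demo callback (hypothetical): counts its calls and always fails. */
static int demo_calls;

static int
demo_init_fails(void)
{
	demo_calls++;
	return EIO;
}
```

Because waiters are woken once the state becomes ONCE_DONE whether or not
the routine succeeded, a failing routine cannot leave later callers
blocked; they simply receive the stored error.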


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
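
The arithmetic behind that limit can be sketched as follows. This is an
illustration under stated assumptions, not the kernel's code: the 2kB
per-vnode cost and the 75% ceilings come from the message above, while
the function names, the byte-count parameters and the return convention
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode in bytes, as assumed by the commit. */
#define VNODE_COST_SKETCH 2048u

/* Largest vnode count that stays within 75% of both KVA and RAM. */
uint64_t
vnode_limit_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	/* 75% of the smaller budget satisfies both ceilings at once. */
	uint64_t budget = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so that large sizes cannot overflow. */
	return budget / VNODE_COST_SKETCH / 4 * 3;
}

/* Returns 0 if the requested value is acceptable, -1 otherwise. */
int
vnode_request_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= vnode_limit_sketch(kva_bytes, phys_bytes) ? 0 : -1;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the smaller budget holds
524288 worst-case vnodes, so the sketch accepts requests up to 393216.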


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structure initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
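
The reference-counting rule in the list above (a lookup takes a
reference, a release drops it, and detaching an in-use discipline fails
with EBUSY) can be sketched in miniature. This is a hedged,
single-threaded illustration: the structure, the _sketch function names
and the EEXIST/ENOENT return values are hypothetical, no locking is
shown, and none of it is the actual tty code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a refcounted line discipline entry. */
struct ldisc_sketch {
	const char *name;
	unsigned refcnt;		/* references held by lookups */
	struct ldisc_sketch *next;	/* linked list, not a fixed array */
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline; the provider calls this itself. */
int
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	for (struct ldisc_sketch *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	for (struct ldisc_sketch *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from a lookup. */
void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	if (d->refcnt > 0)
		d->refcnt--;
}

/* Unregister; refuse with EBUSY while references are outstanding. */
int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc_sketch **pp = &ldisc_list; *pp != NULL;
	    pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A caller that looked up a discipline must release it before the provider
can detach it; a real implementation would also need a lock around the
list and the counter.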


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching(ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks
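
A minimal sketch of the key index -> node mapping table described above, assuming a flat array sized from the driver-supplied maximum key index. The names keyix_t, KEYIX_NONE, struct keyix_map and the keyix_map_* helpers are illustrative stand-ins, not the net80211 definitions; only the idea (size the table from the max key index, bounds check every lookup, fall back to the hash lookup on a miss) comes from the log entry.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint16_t keyix_t;               /* stand-in for ieee80211_keyix */
#define KEYIX_NONE ((keyix_t)-1)        /* stand-in for IEEE80211_KEYIX_NONE */

struct node { int id; };                /* hypothetical station entry */

struct keyix_map {
    keyix_t max_keyix;                  /* plays the role of cs_max_keyid */
    struct node **slots;                /* max_keyix entries, NULL if unmapped */
};

/* Allocate the table once the driver has reported its max key index. */
static int keyix_map_init(struct keyix_map *m, keyix_t max_keyix)
{
    m->max_keyix = max_keyix;
    m->slots = calloc(max_keyix, sizeof(*m->slots));
    return m->slots != NULL ? 0 : -1;
}

/* Record a mapping; out-of-range or "none" indices are silently ignored. */
static void keyix_map_set(struct keyix_map *m, keyix_t ix, struct node *n)
{
    if (ix != KEYIX_NONE && ix < m->max_keyix)
        m->slots[ix] = n;
}

/* Bounds-checked lookup; NULL tells the caller to do the hash lookup. */
static struct node *keyix_map_lookup(const struct keyix_map *m, keyix_t ix)
{
    if (ix == KEYIX_NONE || ix >= m->max_keyix)
        return NULL;
    return m->slots[ix];
}
```

A driver that leaves the rx key index at the "none" value never populates the table, which matches the default behaviour the entry describes for naive drivers.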

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
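
A rough sketch of a marker-skipping iteration macro, assuming a hand-rolled singly linked list so the example stands alone. PROC_FOREACH, P_MARKER and the struct proc layout below are simplified stand-ins, not the kernel definitions.

```c
#include <stddef.h>

#define P_MARKER 0x1        /* hypothetical flag identifying a marker entry */

struct proc {
    int p_pid;
    int p_flag;
    struct proc *p_next;
};

/*
 * Visit every real entry and step over markers. The trailing "else"
 * lets the caller's loop body follow the macro like a normal for loop.
 */
#define PROC_FOREACH(var, head)                                     \
    for ((var) = (head); (var) != NULL; (var) = (var)->p_next)      \
        if (((var)->p_flag & P_MARKER) != 0) continue; else

/* Count real processes on a list that may contain marker entries. */
static int count_procs(struct proc *head)
{
    struct proc *p;
    int n = 0;

    PROC_FOREACH(p, head)
        n++;
    return n;
}
```

The marker lets an iterator remember its position while the list lock is dropped; ordinary walkers simply never see it.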


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
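
The sizing rule can be illustrated with a little arithmetic. This is a sketch only: default_vnodes() is a hypothetical helper, and the per-vnode size and the minimum count passed in are assumptions chosen for the example, not values taken from the kernel.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit in a given share of physical memory.
 * tenths_of_percent: 5 means 0.5%, 10 means 1%.
 * floor_vnodes: minimum count to fall back to on small machines.
 */
static uint64_t default_vnodes(uint64_t physmem_bytes, uint64_t vnode_size,
                               unsigned tenths_of_percent,
                               uint64_t floor_vnodes)
{
    uint64_t n;

    if (vnode_size == 0)
        return floor_vnodes;
    /* Divide first so the multiplication cannot overflow 64 bits. */
    n = physmem_bytes / 1000 * tenths_of_percent / vnode_size;
    return n > floor_vnodes ? n : floor_vnodes;
}
```

With an assumed 256-byte vnode, doubling the share from 0.5% to 1% roughly doubles the cache, and a larger struct vnode (as on 64-bit ports) shrinks the count for the same share.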


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
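
The "invoke until nobody did any work" loop might look roughly like this. It is a sketch under assumptions: finalizer_fn, struct finalizer, run_finalizers() and the configure_one() example are illustrative names, and the table-based registration stands in for whatever list the kernel actually keeps.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
    finalizer_fn fn;
    void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made (always at least one).
 */
static int run_finalizers(const struct finalizer *tab, size_t n)
{
    int passes = 0;
    int did_work;

    do {
        did_work = 0;
        for (size_t i = 0; i < n; i++) {
            if (tab[i].fn(tab[i].arg) != 0)
                did_work = 1;
        }
        passes++;
    } while (did_work);
    return passes;
}

/* Example finalizer: pretends to configure one pending unit per call. */
static int configure_one(void *arg)
{
    int *pending = arg;

    if (*pending == 0)
        return 0;
    (*pending)--;
    return 1;
}
```

Repeating whole passes matters because one finalizer's work (for example, assembling a RAID set) can create new devices that give another finalizer something to do.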


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
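
The mapping above can be written down as a lookup function. This is purely illustrative: the enum values are hypothetical (the old KERN_* constants no longer exist), kern_to_errno() is an invented name, and -1 is used here to mean "no single errno applies", covering both the unused codes and KERN_FAILURE.

```c
#include <errno.h>

/* Hypothetical re-creation of the retired codes; values are arbitrary. */
enum kern_code {
    KERN_SUCCESS,
    KERN_INVALID_ADDRESS,
    KERN_PROTECTION_FAILURE,
    KERN_NO_SPACE,
    KERN_INVALID_ARGUMENT,
    KERN_FAILURE,
    KERN_RESOURCE_SHORTAGE,
    KERN_NOT_RECEIVER,
    KERN_NO_ACCESS,
    KERN_PAGES_LOCKED
};

/* Translate a retired KERN_* code to its E* replacement, or -1 if none. */
static int kern_to_errno(enum kern_code kc)
{
    switch (kc) {
    case KERN_SUCCESS:            return 0;
    case KERN_INVALID_ADDRESS:    return EFAULT;
    case KERN_PROTECTION_FAILURE: return EACCES;
    case KERN_NO_SPACE:           return ENOMEM;
    case KERN_INVALID_ARGUMENT:   return EINVAL;
    case KERN_RESOURCE_SHORTAGE:  return ENOMEM;
    case KERN_FAILURE:            /* various; mostly became KASSERTs */
    case KERN_NOT_RECEIVER:       /* unused */
    case KERN_NO_ACCESS:          /* unused */
    case KERN_PAGES_LOCKED:       /* unused */
    default:                      return -1;
    }
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is not reversible.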


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
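
Both properties can be illustrated with a small hashed timing wheel. This is a sketch of the general technique and not a claim about how the kernel's callout code is laid out: struct wheel, WHEEL_SIZE, callout_arm() and callout_disarm() are hypothetical names. The caller owns each struct callout, so arming never allocates, and the back-pointer makes insertion and removal constant time.

```c
#include <stddef.h>

#define WHEEL_SIZE 8u       /* hypothetical bucket count */

struct callout {
    struct callout *c_next;
    struct callout **c_prevp;   /* address of the pointer that points at us */
    unsigned long c_when;       /* absolute tick at which to fire */
};

struct wheel {
    struct callout *bucket[WHEEL_SIZE];
};

/* Insert at the head of the bucket for tick "when": O(1), no allocation. */
static void callout_arm(struct wheel *w, struct callout *c, unsigned long when)
{
    struct callout **head = &w->bucket[when % WHEEL_SIZE];

    c->c_when = when;
    c->c_next = *head;
    c->c_prevp = head;
    if (*head != NULL)
        (*head)->c_prevp = &c->c_next;
    *head = c;
}

/* Unlink without walking the list: O(1). Safe to call when not pending. */
static void callout_disarm(struct callout *c)
{
    if (c->c_prevp == NULL)
        return;
    *c->c_prevp = c->c_next;
    if (c->c_next != NULL)
        c->c_next->c_prevp = c->c_prevp;
    c->c_next = NULL;
    c->c_prevp = NULL;
}
```

A clock tick would then scan only the bucket for the current tick and fire entries whose c_when matches; that part is omitted here.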


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allow to change the name of the core dump,
and to relocate it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resources limits). Process is designed by its pid
at the second level name. These values are inherited on fork, and the corename
fomat is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
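
The rule described in this entry can be sketched in portable C. This is only
an illustration, not the code from init_main.c: the function name and the
SKETCH_NOFILE and SKETCH_MAXUPRC values are assumptions standing in for the
kernel's NOFILE and MAXUPRC.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/* Hypothetical compile-time defaults standing in for NOFILE and MAXUPRC. */
#define SKETCH_NOFILE	64
#define SKETCH_MAXUPRC	160

static rlim_t
rlim_min(rlim_t a, rlim_t b)
{
	return a < b ? a : b;
}

/*
 * Fill in the initial file and process limits from the system-wide
 * maxima.  The hard limit is the system-wide maximum rather than
 * RLIM_INFINITY, and the soft limit never exceeds the hard limit.
 */
void
sketch_init_limits(struct rlimit *nofile, struct rlimit *nproc,
    rlim_t maxfiles, rlim_t maxproc)
{
	assert(nofile != NULL && nproc != NULL);

	nofile->rlim_max = maxfiles;
	nofile->rlim_cur = rlim_min(SKETCH_NOFILE, maxfiles);

	nproc->rlim_max = maxproc;
	nproc->rlim_cur = rlim_min(SKETCH_MAXUPRC, maxproc);
}
```

With this shape, a program that sizes a table from getrlimit() sees a finite
hard limit, which is the behavior the entry says programs expect.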


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
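
The argument order is the easy part of this mapping to get wrong: the legacy
calls take the source first, while the ISO C calls take the destination
first. A minimal sketch in portable C, using hypothetical legacy_* wrapper
names that are not part of the NetBSD tree:

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical wrappers illustrating the mapping above; the names are
 * for illustration only.
 */

/* bcopy(src, dst, len) -> memcpy(dst, src, len): regions must not overlap. */
void
legacy_bcopy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): overlap is allowed. */
void
legacy_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len): zero means the regions are equal. */
int
legacy_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len). */
void
legacy_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Note that memcmp() also orders its arguments, which bcmp() never promised, so
the replacement is a superset of the old behavior for equality tests.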


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.545 12-Sep-2023 ad

Back out recent change to replace pool_cache with the general allocator.
Will return to this when I have time again.


# 1.544 10-Sep-2023 ad

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout
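
The progress check described in this entry can be illustrated with a
single-threaded sketch. It is not the heartbeat(9) implementation: the struct
and function names are hypothetical, the stall budget is counted in checker
ticks rather than seconds, and the real mechanism works across CPUs and
interrupt levels, which this sketch does not model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state for one monitored context. */
struct sketch_heartbeat {
	unsigned count;		/* bumped by the monitored context */
	unsigned last_seen;	/* value the checker saw last time */
	unsigned stalled_ticks;	/* consecutive checks without progress */
	unsigned max_ticks;	/* stall budget before declaring failure */
};

void
sketch_heartbeat_init(struct sketch_heartbeat *hb, unsigned max_ticks)
{
	assert(hb != NULL && max_ticks > 0);
	hb->count = 0;
	hb->last_seen = 0;
	hb->stalled_ticks = 0;
	hb->max_ticks = max_ticks;
}

/* Called by the monitored context whenever it makes progress. */
void
sketch_heartbeat_beat(struct sketch_heartbeat *hb)
{
	hb->count++;
}

/*
 * Called periodically by the checker.  Returns false once the monitored
 * context has made no progress for max_ticks consecutive checks; a real
 * kernel would panic at that point.
 */
bool
sketch_heartbeat_check(struct sketch_heartbeat *hb)
{
	if (hb->count != hb->last_seen) {
		hb->last_seen = hb->count;
		hb->stalled_ticks = 0;
		return true;
	}
	return ++hb->stalled_ticks < hb->max_ticks;
}
```

The checker only compares counters, so it stays cheap enough to run from a
periodic tick, which matches the entry's reliance on hardclock.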


Revision tags: netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
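
The thresholds in this entry amount to a small step function. The sketch
below assumes 1GB means 2^30 bytes and takes the small-memory fallback as a
parameter, since the entry does not state it; the function name is
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_GB	(UINT64_C(1024) * 1024 * 1024)

/*
 * Pick a default open-file limit from the amount of RAM.  The fallback
 * for machines under 1GB is passed in because the log entry does not
 * state it.
 */
int
sketch_default_maxfiles(uint64_t ram_bytes, int small_default)
{
	assert(small_default > 0);

	if (ram_bytes >= 16 * SKETCH_GB)
		return 20000;
	if (ram_bytes >= 1 * SKETCH_GB)
		return 10000;
	return small_default;
}
```

A value given explicitly in the kernel config file would bypass this
selection entirely, as the entry notes.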


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
These return the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
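
The suspend/resume pattern described above might look roughly like the
following userland sketch. All function bodies are stand-ins that only
record the call order; the time_t argument to todr_set_systime(),
md_suspend() and suspend_resume() are assumptions made for illustration,
not the kernel's definitions.

```c
#include <stdbool.h>
#include <string.h>
#include <time.h>

/* Userland stand-ins; the kernel versions touch real hardware. */
static bool todr_held;
static char trace[16];		/* records the order of calls */

static void
note(char c)
{
	size_t n = strlen(trace);

	if (n + 1 < sizeof(trace)) {
		trace[n] = c;
		trace[n + 1] = '\0';
	}
}

static void todr_lock(void)   { todr_held = true;  note('L'); }
static void todr_unlock(void) { todr_held = false; note('U'); }

/* Both expect the lock held; '!' in the trace marks a violation. */
static void todr_save_systime(void) { note(todr_held ? 'S' : '!'); }
static void todr_set_systime(time_t base)
{
	(void)base;
	note(todr_held ? 'R' : '!');
}

/* Hypothetical machine-dependent sleep; returns when "awakened". */
static void md_suspend(void) { note('z'); }

static void
suspend_resume(time_t base)
{
	todr_lock();
	todr_save_systime();	/* system time -> TOD clock */
	md_suspend();
	todr_set_systime(base);	/* TOD clock -> system time */
	todr_unlock();
}
```

Holding the lock across the whole sequence keeps any other TODR access
from interleaving between the save and the restore.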


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
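
As a rough illustration of the circular sibling lists, here is a
self-contained sketch that links CPUs sharing a core into a ring. The
struct layout and the names link_core_siblings() and ring_size() are
hypothetical and do not reflect the actual struct cpu_info fields.

```c
#include <stddef.h>

/* Hypothetical per-CPU record; not the kernel's struct cpu_info. */
struct cpu {
	int package_id;
	int core_id;
	struct cpu *next_in_core;	/* circular: last sibling points to first */
};

/* Link every CPU into a ring with the other CPUs of the same core. */
static void
link_core_siblings(struct cpu *cpus, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		cpus[i].next_in_core = &cpus[i];	/* singleton ring */
		for (j = 0; j < i; j++) {
			if (cpus[j].package_id == cpus[i].package_id &&
			    cpus[j].core_id == cpus[i].core_id) {
				/* splice cpu i into j's ring, right after j */
				cpus[i].next_in_core = cpus[j].next_in_core;
				cpus[j].next_in_core = &cpus[i];
				break;
			}
		}
	}
}

/* Walk the ring once and count its members. */
static size_t
ring_size(const struct cpu *c)
{
	const struct cpu *p;
	size_t n = 1;

	for (p = c->next_in_core; p != c; p = p->next_in_core)
		n++;
	return n;
}
```

A per-package ring would follow the same shape, comparing package_id
only.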


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs, a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
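
The correction amounts to a timespec subtraction: the time of day known
at initialisation, minus the uptime accumulated since the clocks
started. A minimal sketch, with hypothetical helper names (ts_sub() and
boottime_from() are not the kernel's functions):

```c
#include <time.h>

/* a - b for timespecs, borrowing from tv_sec when tv_nsec underflows. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time = time of day at the moment the ToD was initialised, minus
 * the uptime at that same moment.
 */
static struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

For example, a time of day of 1540000006.2 s read at an uptime of 6.5 s
gives a boot time of 1539999999.7 s.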

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as are lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
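
The "always present op vector" idea can be sketched as follows. This is
a simplified userland illustration under assumed names (tap_ops_sketch,
tap_install_sketch, driver_rx_sketch); it is not the actual bpf
interface. Callers always go through the vector, which points at no-op
stubs until a real implementation installs itself.

```c
#include <stddef.h>

/* Hypothetical op vector; the real one has different members. */
struct tap_ops_sketch {
	void (*attach)(void *ifp);
	int  (*tap)(void *ifp, const void *pkt, size_t len);
};

/* Stubs used while no packet filter is present. */
static void
stub_attach(void *ifp)
{
	(void)ifp;
}

static int
stub_tap(void *ifp, const void *pkt, size_t len)
{
	(void)ifp; (void)pkt; (void)len;
	return 0;		/* nothing captured */
}

static const struct tap_ops_sketch stub_ops = { stub_attach, stub_tap };

/* The vector is never NULL, so clients need no #if guards. */
static const struct tap_ops_sketch *tap_ops = &stub_ops;

/* A stand-in "real" implementation that counts tapped packets. */
static int real_taps;

static int
real_tap(void *ifp, const void *pkt, size_t len)
{
	(void)ifp; (void)pkt; (void)len;
	real_taps++;
	return 1;		/* packet seen */
}

static const struct tap_ops_sketch real_ops = { stub_attach, real_tap };

/* Called by the filter when it attaches. */
static void
tap_install_sketch(const struct tap_ops_sketch *ops)
{
	tap_ops = ops;
}

/* Driver receive path: unconditional call through the vector. */
static int
driver_rx_sketch(void *ifp, const void *pkt, size_t len)
{
	return tap_ops->tap(ifp, pkt, len);
}
```

The sketch ignores the ordering problem the log mentions (drivers
registering before the filter attaches); it only shows why the client
side no longer needs conditional compilation.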


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.
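
A weak default of this kind can be sketched as below. The sketch uses
the GCC/Clang weak attribute directly rather than the kernel's own
macros, and the names machdep_init_sketch and boot_sketch are
hypothetical. A port that needs late machine-dependent setup would
supply a strong definition with the same name, which the linker then
prefers over the default.

```c
/* Counts calls to the default, for demonstration only. */
static int default_machdep_calls;

/*
 * Weak default: used only when no strong machdep_init_sketch() is
 * linked in.  In this sketch it just records that it ran.
 */
__attribute__((weak)) void
machdep_init_sketch(void)
{
	default_machdep_calls++;
}

/* Stand-in for the MI boot path that calls the hook unconditionally. */
static void
boot_sketch(void)
{
	/* ... earlier initialization would happen here ... */
	machdep_init_sketch();
}
```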


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
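
The handling of the special strings at the init path prompt might look
roughly like the sketch below. The enum, the function name
classify_init_answer, and the rule that a valid path starts with '/'
are assumptions for illustration; the log only names the three special
strings.

```c
#include <string.h>

enum init_answer {
	ANS_HALT,	/* user typed "halt" */
	ANS_REBOOT,	/* user typed "reboot" */
	ANS_DDB,	/* user typed "ddb" */
	ANS_PATH,	/* looks like a path to try as init */
	ANS_INVALID	/* anything else: print usage and ask again */
};

static enum init_answer
classify_init_answer(const char *s)
{
	if (strcmp(s, "halt") == 0)
		return ANS_HALT;
	if (strcmp(s, "reboot") == 0)
		return ANS_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return ANS_DDB;
	if (s[0] == '/')	/* assumed test for a usable path */
		return ANS_PATH;
	return ANS_INVALID;
}
```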


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
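
The behaviour described here (lockless fast path, one global lock on
the slow path, no retry after an error) can be sketched in userland
with C11 atomics and a pthread mutex. This is an illustration under
assumed names (once_sketch, run_once_sketch, failing_init), not the
kernel implementation.

```c
#include <pthread.h>
#include <stdatomic.h>

enum { ONCE_NEW = 0, ONCE_DONE = 1 };

struct once_sketch {
	atomic_int state;	/* ONCE_NEW or ONCE_DONE */
	int error;		/* result of the init routine once DONE */
};

/* One global lock shared by all once objects; no per-object lock. */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock is taken after initialization completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		/* Run the routine a single time, even if it fails. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Example init routine that always fails with error 5. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

In this sketch a second call after a failure returns the stored error
without running the routine again, which mirrors the "does not retry"
semantics. An init routine that itself calls run_once_sketch() would
deadlock on the global lock; the sketch does not address that.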


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
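
The arithmetic behind this cap can be written out as a small sketch.
The function name and the constant name are hypothetical; the 2kB cost
and the 75% limits come from the text above.

```c
#include <stdint.h>

#define VNODE_COST_SKETCH 2048u	/* assumed worst-case bytes per vnode */

/*
 * Largest vnode count whose worst-case cost stays within 75% of both
 * kernel virtual address space and physical memory (sizes in bytes).
 */
static uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying to avoid overflow on large sizes. */
	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the tighter
limit is KVA, giving 0.75 * 2^30 / 2^11 = 393216 vnodes.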


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
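
The reference-counting rule above (a lookup holds a reference, a release drops it, and detaching an in-use discipline fails with EBUSY) can be illustrated with a minimal userland sketch in C. The `struct ldisc` type and the `ldisc_*` function names are hypothetical stand-ins, not the kernel's ttyldisc API, and the sketch leaves out the locking a kernel implementation would need.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical line discipline record, for illustration only. */
struct ldisc {
	const char *name;	/* discipline name, e.g. "termios" */
	unsigned refcnt;	/* references held by successful lookups */
	int attached;		/* nonzero while registered with the system */
};

/* A successful lookup takes a reference on the discipline. */
static struct ldisc *
ldisc_lookup(struct ldisc *d)
{
	assert(d != NULL);
	if (!d->attached)
		return NULL;
	d->refcnt++;
	return d;
}

/* Drop a reference previously taken by ldisc_lookup(). */
static void
ldisc_release(struct ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Detach succeeds only when nobody holds a reference. */
static int
ldisc_detach(struct ldisc *d)
{
	assert(d != NULL);
	if (d->refcnt != 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

In this sketch a caller that has looked up a discipline blocks its removal until it calls `ldisc_release()`, after which `ldisc_detach()` returns 0 and later lookups fail.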


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
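
The "once control" idea (run a subsystem's initializer on first use instead of calling it from main()) can be sketched in plain C as follows. This is a simplified, single-threaded illustration: `struct once_ctl`, `run_once()`, `subsys_init()` and `subsys_open()` are hypothetical names, and a kernel once control would also have to serialize concurrent first callers, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, single-threaded stand-in for a once control. */
struct once_ctl {
	int done;	/* nonzero after the init function has run */
	int status;	/* value the init function returned */
};

/* Run fn on the first call only; later calls return the saved status. */
static int
run_once(struct once_ctl *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);
	if (!o->done) {
		o->status = fn();
		o->done = 1;
	}
	return o->status;
}

static int subsys_init_calls;	/* counts how often the initializer ran */

/* Hypothetical subsystem initializer. */
static int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

/* Hypothetical open routine: initializes lazily instead of from main(). */
static int
subsys_open(void)
{
	static struct once_ctl ctl;	/* zero-initialized: not yet run */

	return run_once(&ctl, subsys_init);
}
```

Calling `subsys_open()` any number of times runs `subsys_init()` exactly once, which is why the explicit call from main() can be dropped.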


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
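
The substitution described above can be illustrated with a minimal userland sketch in C. It is not the kernel's implementation: `struct magic_entry` and `expand_magic()` are hypothetical names, the table values are supplied by the caller, and the sketch assumes that names sharing a prefix (such as "machine_arch" and "machine") are listed longest first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name (without '@') and its value. */
struct magic_entry {
	const char *name;
	const char *value;
};

/*
 * Copy `in` to `out`, replacing each "@name" found in the table with its
 * value. `outlen` is the capacity of `out` including the terminating NUL.
 * Returns 0 on success, -1 if the expanded string does not fit.
 * Entries are tried in table order, so list longer names first.
 */
static int
expand_magic(const char *in, char *out, size_t outlen,
    const struct magic_entry *tab, size_t ntab)
{
	size_t o = 0;

	assert(in != NULL && out != NULL);
	if (outlen == 0)
		return -1;

	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*in == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (strncmp(in + 1, tab[i].name, n) == 0) {
					rep = tab[i].value;
					skip = n + 1;	/* '@' plus the name */
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t r = strlen(rep);

			if (o + r >= outlen)	/* keep room for the NUL */
				return -1;
			memcpy(out + o, rep, r);
			o += r;
			in += skip;
		} else {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

With a table that maps "machine_arch" to "sparc", the link target from the example, `/arch/@machine_arch/bin`, expands to `/arch/sparc/bin`; an unknown `@name` is copied through unchanged.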


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
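
The sizing rule described in this entry (a fixed fraction of physical
memory, divided by the size of one vnode) can be sketched as follows. This
is only an illustration under stated assumptions: the function name, the
per-mille parameter, and the lower bound are hypothetical and are not taken
from init_main.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's actual code: choose a default
 * vnode count so that the cache occupies roughly `permille` thousandths
 * of physical memory, but never fewer than `min_vnodes` entries.
 * Assumes vnode_size is greater than zero.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, unsigned permille,
    size_t vnode_size, uint64_t min_vnodes)
{
	/* Divide first to avoid overflow on very large memory sizes. */
	uint64_t budget = physmem_bytes / 1000 * permille;
	uint64_t count = budget / vnode_size;

	return count < min_vnodes ? min_vnodes : count;
}
```

With permille set to 10 this models the 1% default, and with 5 the earlier
0.5% default. A larger vnode_size yields fewer vnodes for the same budget,
which is why the entry suggests a higher fraction for 64-bit ports.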


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
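
The "invoke until nobody reports work" loop described in this entry can be
sketched as below. This is a minimal illustration, not the kernel's
implementation: the struct, the callback signature, and the example
finalizer are all hypothetical names introduced here.

```c
#include <stddef.h>

/* Hypothetical finalizer signature: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer in turn, and repeat the whole pass until one
 * complete pass finishes with no finalizer reporting work.
 * Returns the number of passes made (always at least one).
 */
int
run_finalizers(struct finalizer *list, size_t count)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < count; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: reports work until its counter reaches zero. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

Repeating the full pass matters because work done by one finalizer (for
example, attaching a new device) may give another finalizer something to do.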


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
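
For illustration, the one-to-one rows of the mapping above can be written as
a lookup function. This is a sketch, not code from the tree: the OLD_KERN_*
enumerators and their numeric values are hypothetical stand-ins, and
KERN_FAILURE and the unused codes are left out because they have no single
errno equivalent.

```c
#include <errno.h>

/* Illustrative stand-ins for the retired codes; values are assumptions. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old-style code to the traditional errno value listed
 * in the mapping. Returns -1 for codes with no one-to-one equivalent.
 */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```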


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump,
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
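
A minimal sketch of the rule in this entry, assuming a POSIX struct rlimit.
The helper name and its parameters are hypothetical; in the kernel the
inputs would be NOFILE/maxfiles or MAXUPRC/maxproc.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: set the hard limit to the system-wide maximum
 * rather than RLIM_INFINITY, and clamp the soft limit so that it
 * never exceeds the hard limit.
 */
void
init_limit(struct rlimit *rl, rlim_t default_soft, rlim_t system_max)
{
	rl->rlim_max = system_max;
	rl->rlim_cur = default_soft < system_max ? default_soft : system_max;
}
```

Clamping keeps rlim_cur <= rlim_max even when the system-wide maximum is
configured below the compile-time default, so getrlimit() callers see a
finite, consistent pair of values.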


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
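
The argument-order change is the easy part to get wrong in this conversion:
bcopy() and ovbcopy() take the source first, while memcpy() and memmove()
take the destination first. The wrappers below spell out the four rewrites;
the compat_* names are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len): arguments swap. */
void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): overlap-safe copy. */
void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len): zero still means "equal". */
int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len): fill value becomes explicit. */
void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```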


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.545 12-Sep-2023 ad

Back out recent change to replace pool_cache with the general allocator.
Will return to this when I have time again.


# 1.544 10-Sep-2023 ad

- Do away with separate pool_cache for some kernel objects that have no special
requirements and use the general purpose allocator instead. On one of my
test systems this makes for a small (~1%) but repeatable reduction in system
time during builds presumably because it decreases the kernel's cache /
memory bandwidth footprint a little.
- vfs_lockf: cache a pointer to the uidinfo and put mutex in the data segment.


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


Revision tags: netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
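
The thresholds described above can be sketched as a small helper. This is
only an illustration, not the code that was committed: the function name
default_maxfiles() and the `fallback' parameter (the value used below 1GB,
which the log message does not state) are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/*
 * Hypothetical helper, not the kernel's code: pick a default open-file
 * limit from the amount of RAM, using the thresholds quoted in the log.
 * `fallback' stands in for the default used on machines with less than
 * 1GB of RAM, which the log message does not specify.
 */
static int
default_maxfiles(uint64_t ram_bytes, int fallback)
{
	assert(fallback > 0);

	if (ram_bytes >= 16 * GiB)
		return 20000;
	if (ram_bytes >= 1 * GiB)
		return 10000;
	return fallback;
}
```

A value set explicitly in the kernel config file would bypass a helper like
this entirely; it only models the "not otherwise specified" case.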


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
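
The epoch-caching usage described above (cache the epoch when seeding,
reseed if it has changed before use) might look roughly like the following
userland sketch. Everything here is a stand-in: fake_entropy_epoch() plays
the role of entropy_epoch(), and struct consumer and its functions are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland stand-in for the kernel's epoch counter (hypothetical). */
static unsigned fake_epoch = (unsigned)-1;	/* -1: never reached full entropy */

static unsigned
fake_entropy_epoch(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that keeps state seeded from the entropy pool. */
struct consumer {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseeds;	/* number of reseeds performed so far */
};

static void
consumer_init(struct consumer *c)
{
	c->cached_epoch = 0;	/* the epoch is never 0, so first use reseeds */
	c->reseeds = 0;
}

/*
 * Call just before using the seeded state.  Reseeds if the epoch has
 * changed since the last seed, and returns true if a reseed happened.
 */
static bool
consumer_check(struct consumer *c)
{
	unsigned now = fake_entropy_epoch();

	if (c->cached_epoch == now)
		return false;
	c->cached_epoch = now;
	c->reseeds++;	/* a real consumer would draw fresh seed material here */
	return true;
}
```

Comparing for change rather than testing for -1 follows the advice in the
log: a later consolidation then still triggers a reseed even if the system
had earlier misjudged its own entropy.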

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.
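
The split between the lock-held functions and the historically-named
wrappers can be mocked up in userland as below. This is a sketch under
loud assumptions: every mock_* name is invented, a boolean flag stands in
for the kernel mutex, the clocks are plain integers, and the set function
takes no argument although the log shows todr_set_systime(...) taking one.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland mock: a flag stands in for the mutex around the TODR. */
static bool todr_locked;
static long saved_time;			/* pretend time-of-day register */
static long system_time = 1000;		/* pretend system clock */

static void
mock_todr_lock(void)
{
	assert(!todr_locked);
	todr_locked = true;
}

static void
mock_todr_unlock(void)
{
	assert(todr_locked);
	todr_locked = false;
}

/* Lock-held variant: the caller must already hold the TODR lock. */
static void
mock_todr_save_systime(void)
{
	assert(todr_locked);
	saved_time = system_time;
}

/* Lock-held variant: the caller must already hold the TODR lock. */
static void
mock_todr_set_systime(void)
{
	assert(todr_locked);
	system_time = saved_time;
}

/* Wrapper in the historical style: takes and releases the lock itself. */
static void
mock_resettodr(void)
{
	mock_todr_lock();
	mock_todr_save_systime();
	mock_todr_unlock();
}

/* The suspend/resume pattern, holding the lock across the whole cycle. */
static void
mock_suspend_resume(long time_asleep)
{
	mock_todr_lock();
	mock_todr_save_systime();
	saved_time += time_asleep;	/* hardware clock keeps ticking while asleep */
	mock_todr_set_systime();
	mock_todr_unlock();
}
```

Unlike this mock, the real todr_save_systime() does not yet assert that the
lock is held, for the reason given in the note that follows.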

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
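
The subtraction can be illustrated with plain timespec arithmetic. This is
a minimal sketch rather than the kernel's code: ts_sub() and
estimate_boottime() are hypothetical names, and the inputs are assumed to
be normalized (0 <= tv_nsec < 1000000000).

```c
#include <assert.h>
#include <time.h>

/* Compute a - b for normalized timespecs, borrowing a second if needed. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Hypothetical helper: the boot time is the time of day at the moment it
 * was initialised, minus the time the clocks had already been running.
 */
static struct timespec
estimate_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With an uptime of 6.5 seconds at the point the time of day is initialised,
the estimated boot time lands 6.5 seconds earlier, which is the kind of
offset the fix removes.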

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
    1 - ptrace(2) syscall for native emulation
    2 - common ptrace(2) syscall code (shared with compat_netbsd32)
    3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
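
A minimal sketch of the size mix-up, assuming a made-up argument
structure: struct model_execve_args is hypothetical and only stands in
for "an array of syscallargs" whose size has nothing to do with the
three-pointer argv[] that actually has to be copied.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical syscall-argument structure; its size is unrelated to argv. */
struct model_execve_args {
	const char *path;
	char *const *argp;
	char *const *envp;
	long pad[4];
};

#define INIT_ARGC 3 /* per the log: argv[] holds three pointers */

/* The length the log describes as the bug: size of the args structure. */
size_t wrong_copy_len(void)
{
	return sizeof(struct model_execve_args);
}

/* Copies exactly INIT_ARGC pointers and returns the byte count used. */
size_t copy_init_argv(char **dst, char *const src[INIT_ARGC])
{
	size_t len = INIT_ARGC * sizeof(char *);

	memcpy(dst, src, len); /* stands in for copyout() in this model */
	return len;
}
```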


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
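
The "establish the hook on first syscall entry" idea can be sketched as
follows. This is a single-threaded userland model with hypothetical
names; the kernel's RUN_ONCE also handles concurrent callers, which
this sketch deliberately does not.

```c
#include <stdbool.h>

static bool hook_established = false; /* has the one-time setup run? */
static int establish_calls = 0;       /* how often the setup ran */

/* Models the one-time exithook establishment. */
static void establish_exithook(void)
{
	establish_calls++;
}

/* Single-threaded run-once helper, called at entry to each syscall. */
void ensure_exithook(void)
{
	if (!hook_established) {
		establish_exithook();
		hook_established = true;
	}
}

/* Hypothetical syscall entry points that all share the same setup. */
int model_sys_semget(void) { ensure_exithook(); return 0; }
int model_sys_semop(void)  { ensure_exithook(); return 0; }

/* Accessor used to observe the model. */
int exithook_establish_count(void)
{
	return establish_calls;
}
```

If no syscall is ever invoked, the hook is never established, which is
why the unload path can no longer assert that it exists.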


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are
loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add:
    options SPLASHSCREEN
    makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now
handled as part of module initialization, and do not require manual
invocation. sysmon_taskq is special, since it is potentially used by
several non-module users who may need the facility before modules are
fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
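
For reference, turning 16 random bytes into a version 4 UUID only
requires stamping two fields. The sketch below assumes the
network-byte-order layout from RFC 4122 and a hypothetical helper name;
it is not the kernel's uuidgen code, and the caller is assumed to
supply the random bytes.

```c
#include <stdint.h>

/*
 * Stamps the version (4) and the RFC 4122 variant onto 16
 * caller-supplied random bytes, leaving all other bits untouched.
 */
void uuid_v4_stamp(uint8_t u[16])
{
	u[6] = (uint8_t)((u[6] & 0x0f) | 0x40); /* version nibble = 4 */
	u[8] = (uint8_t)((u[8] & 0x3f) | 0x80); /* variant bits = 10 */
}
```

No MAC address or timestamp is involved, which is the point of the
change.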


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
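
A sketch of how such a NULL-terminated fallback list might be walked,
assuming the paths are tried in order until one can be executed. The
pick_init() helper and the two probe functions are hypothetical; the
list contents are the ones given above.

```c
#include <stddef.h>
#include <string.h>

/* The list from the log entry, terminated by NULL. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Returns the first path for which try_exec() returns 0, else NULL. */
const char *pick_init(int (*try_exec)(const char *path))
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Hypothetical probe: pretends only /rescue/init can be executed. */
int only_rescue_works(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Hypothetical probe: pretends nothing can be executed. */
int nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```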


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
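
The register/evaluate pattern, including the "module absent" case, can
be sketched in userland C. Everything named model_* or toy_* here is
hypothetical, as are the error codes and the use of a plain uid in
place of a credential; only the identifier strings come from the
example above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Evaluator callback: answers the question "what" about "arg". */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct model_secmodel {
	const char *id;
	eval_fn eval;
};

#define MAX_MODELS 4
static struct model_secmodel registry[MAX_MODELS];
static size_t nregistered;

/* Registers an evaluator under an id; fails when the table is full. */
int model_secmodel_register(const char *id, eval_fn fn)
{
	if (nregistered == MAX_MODELS)
		return ENOMEM;
	registry[nregistered].id = id;
	registry[nregistered].eval = fn;
	nregistered++;
	return 0;
}

/* Forwards a request to the named model, or reports that it is absent. */
int model_secmodel_eval(const char *id, const char *what, void *arg,
    void *ret)
{
	for (size_t i = 0; i < nregistered; i++) {
		if (strcmp(registry[i].id, id) == 0)
			return registry[i].eval(what, arg, ret);
	}
	return ENOENT; /* hypothetical error for "no such secmodel" */
}

/* Toy evaluator: arg points to a uid; "is-root" answers uid == 0. */
int toy_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

The caller never touches the other model's state directly, so an absent
model shows up as an error return rather than an unsafe access.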

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
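
The general idea of a continuous-output test is to compare each new
output block with the previous one and reject an exact repeat, which is
what a stuck generator produces. The sketch below illustrates that idea
only; the block size and all names are assumptions, and it is not the
rngtest code referred to above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_LEN 8 /* hypothetical block size in bytes */

struct cont_test {
	unsigned char prev[BLOCK_LEN]; /* last block seen */
	bool have_prev;                /* false until the first block */
};

/*
 * Returns true if the block is acceptable, false if it is identical to
 * the previous block. The new block always becomes the reference.
 */
bool cont_test_check(struct cont_test *ct,
    const unsigned char block[BLOCK_LEN])
{
	bool ok = !ct->have_prev ||
	    memcmp(ct->prev, block, BLOCK_LEN) != 0;

	memcpy(ct->prev, block, BLOCK_LEN);
	ct->have_prev = true;
	return ok;
}
```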

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
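
The flag logic in items 2 and 4 can be sketched as below. The structure,
flag values, and the EPERM return are hypothetical; only the rule itself
(a built-in module marked "must force" loads again only with the force
flag) is taken from the description.

```c
#include <errno.h>
#include <stdbool.h>

#define MODEL_MODFLG_MUST_FORCE 0x01 /* hypothetical value */
#define MODEL_LOAD_FORCE        0x01 /* hypothetical value */

struct model_module {
	int flags;    /* models mod_flags */
	bool builtin; /* entry is on the list of builtins */
	bool loaded;
};

/* Marks a module so that re-loads need the force flag; never cleared. */
void model_require_force(struct model_module *m)
{
	m->flags |= MODEL_MODFLG_MUST_FORCE;
}

/* Refuses to load a marked built-in unless the caller forces it. */
int model_load(struct model_module *m, int load_flags)
{
	if (m->builtin && (m->flags & MODEL_MODFLG_MUST_FORCE) != 0 &&
	    (load_flags & MODEL_LOAD_FORCE) == 0)
		return EPERM; /* hypothetical error code */
	m->loaded = true;
	return 0;
}
```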

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
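
Deferring part of a driver's attach until the root file system is
available amounts to queueing a callback and running it later. The
sketch below is a userland model with hypothetical names and a fixed
queue size; running callbacks in registration order is an assumption of
the sketch, not a statement about config_mountroot(9).

```c
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*deferred_fn)(void *arg);

static struct {
	deferred_fn fn;
	void *arg;
} deferred[MAX_DEFERRED];
static size_t ndeferred;

/* Queues a callback to run after root is mounted; -1 if the queue is full. */
int defer_until_mountroot(deferred_fn fn, void *arg)
{
	if (ndeferred == MAX_DEFERRED)
		return -1;
	deferred[ndeferred].fn = fn;
	deferred[ndeferred].arg = arg;
	ndeferred++;
	return 0;
}

/* Called once root is mounted: runs and then drops all queued callbacks. */
void run_mountroot_deferred(void)
{
	for (size_t i = 0; i < ndeferred; i++)
		deferred[i].fn(deferred[i].arg);
	ndeferred = 0;
}

/* Toy callback: pretends to finish attach once firmware can be read. */
void toy_finish_attach(void *arg)
{
	*(int *)arg += 1;
}
```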


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.
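
An "always present op vector" means callers invoke through a pointer
that starts out aimed at no-op stubs and is switched to the real
implementation when it attaches. The sketch below uses hypothetical
names and a single operation; it illustrates the pattern, not the
actual bpf interface.

```c
#include <stddef.h>

/* Operation table that callers use unconditionally. */
struct model_bpf_ops {
	void (*tap)(const void *pkt, size_t len);
};

static size_t tapped_bytes; /* bytes seen by the real implementation */

/* Stub used while the real implementation is not attached. */
static void stub_tap(const void *pkt, size_t len)
{
	(void)pkt;
	(void)len;
}

/* Real implementation: here it only counts bytes. */
static void real_tap(const void *pkt, size_t len)
{
	(void)pkt;
	tapped_bytes += len;
}

static const struct model_bpf_ops stub_ops = { .tap = stub_tap };
static const struct model_bpf_ops real_ops = { .tap = real_tap };

/* Never NULL: starts at the stubs, so callers need no #if or NULL check. */
static const struct model_bpf_ops *bpf_ops_ptr = &stub_ops;

/* Switches callers over to the real implementation. */
void model_bpf_attach(void)
{
	bpf_ops_ptr = &real_ops;
}

/* Driver receive path: calls through the vector unconditionally. */
void model_driver_rx(const void *pkt, size_t len)
{
	bpf_ops_ptr->tap(pkt, len);
}

/* Accessor used to observe the model. */
size_t model_tapped_bytes(void)
{
	return tapped_bytes;
}
```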

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed with no objections on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
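
The limit described above can be sketched as simple arithmetic. This is only an illustration under stated assumptions, not the code from the commit: the helper name max_desired_vnodes() and the constant VNODE_COST_BYTES are hypothetical, and the real kernel check may round or order the operations differently.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode plus associated structures (2kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest desiredvnodes value that keeps vnode
 * memory within 75% of KVA and 75% of physical memory.
 */
static uint64_t
max_desired_vnodes(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	/* The tighter of the two resources sets the ceiling. */
	uint64_t limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Take 75% of it (divide first to avoid overflow), then per-vnode cost. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, KVA is the tighter bound and the sketch yields a cap of 393216 vnodes.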


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
  {get,}{micro,nano,bin}time()
  get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
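
The refcounting rule in the list above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be sketched in userland roughly as follows. The `ldisc_*` names and the struct layout are hypothetical stand-ins, not the real ttyldisc_*() signatures, and the locking a kernel would need is left out:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered line discipline. */
struct ldisc {
    const char   *name;
    unsigned int  refcnt;  /* references held by successful lookups */
    struct ldisc *next;    /* linked list instead of a fixed array */
};

static struct ldisc *ldisc_head;

/* Register a discipline; providers call this themselves. */
int ldisc_attach(struct ldisc *ld)
{
    for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
        if (strcmp(p->name, ld->name) == 0)
            return EEXIST;
    ld->refcnt = 0;
    ld->next = ldisc_head;
    ldisc_head = ld;
    return 0;
}

/* Look up by name; a hit takes a reference. */
struct ldisc *ldisc_lookup(const char *name)
{
    for (struct ldisc *p = ldisc_head; p != NULL; p = p->next) {
        if (strcmp(p->name, name) == 0) {
            p->refcnt++;
            return p;
        }
    }
    return NULL;
}

/* Drop a reference taken by ldisc_lookup(). */
void ldisc_release(struct ldisc *ld)
{
    if (ld->refcnt > 0)
        ld->refcnt--;
}

/* Unregister; refuses while any reference is outstanding. */
int ldisc_detach(struct ldisc *ld)
{
    if (ld->refcnt != 0)
        return EBUSY;
    for (struct ldisc **pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == ld) {
            *pp = ld->next;
            ld->next = NULL;
            return 0;
        }
    }
    return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with one ldisc_release(); only then can ldisc_detach() succeed.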


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
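
The "once control" pattern named here (and in the neighbouring NFS and 802.11 commits) replaces an unconditional init call in main() with initialization on first use. A minimal single-threaded sketch, assuming hypothetical names (`once_ctl`, `run_once`, `subsys_*`); a kernel version would also have to serialize concurrent first callers:

```c
#include <stdbool.h>

/* Hypothetical once control: remembers whether init ran and its result. */
struct once_ctl {
    bool done;
    int  error;
};

/* Run init() the first time only; later calls return the saved result. */
int run_once(struct once_ctl *o, int (*init)(void))
{
    if (!o->done) {
        o->error = init();
        o->done = true;
    }
    return o->error;
}

/* Example subsystem that initializes itself lazily on first open. */
int subsys_init_calls;               /* counts how often init really ran */
static struct once_ctl subsys_once;

int subsys_init(void)
{
    subsys_init_calls++;
    return 0;
}

int subsys_open(void)
{
    /* No explicit init call at boot; the first open triggers it. */
    return run_once(&subsys_once, subsys_init);
}
```

With this shape, a kernel that never opens the subsystem never pays for its initialization.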


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine       value of MACHINE for the system
@machine_arch  value of MACHINE_ARCH for the system
@hostname      the system host name, as set with sethostname()
@domainname    the system domain name, as set with setdomainname()
@kernel_ident  the kernel config file name
@osrelease     the release number of the OS
@ostype        the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
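
How such an expansion might work can be sketched in userland as follows. This is an illustration only: `struct magic`, `magic_expand()` and the table contents are hypothetical, and the rule that a magic name must be followed by '/' or the end of the string is an assumption of this sketch (it keeps @machine from matching the front of @machine_arch), not something this log states:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: magic name (without '@') and its value. */
struct magic {
    const char *name;
    const char *value;
};

/*
 * Copy src to dst, replacing "@name" with its value when the name is
 * followed by '/' or the end of the string. Returns 0 on success and
 * -1 if dst is too small.
 */
int magic_expand(const char *src, const struct magic *tab, size_t ntab,
                 char *dst, size_t dstlen)
{
    size_t out = 0;

    if (dstlen == 0)
        return -1;
    while (*src != '\0') {
        const char *rep = NULL;
        size_t skip = 0;

        if (*src == '@') {
            for (size_t i = 0; i < ntab; i++) {
                size_t n = strlen(tab[i].name);

                if (strncmp(src + 1, tab[i].name, n) == 0 &&
                    (src[1 + n] == '/' || src[1 + n] == '\0')) {
                    rep = tab[i].value;
                    skip = 1 + n;
                    break;
                }
            }
        }
        if (rep != NULL) {
            size_t n = strlen(rep);

            if (out + n >= dstlen)
                return -1;
            memcpy(dst + out, rep, n);
            out += n;
            src += skip;
        } else {
            /* Ordinary character, or an '@' that matched nothing. */
            if (out + 1 >= dstlen)
                return -1;
            dst[out++] = *src++;
        }
    }
    dst[out] = '\0';
    return 0;
}
```

Unknown names are copied through unchanged, so a link target containing a literal '@' keeps working in this sketch.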


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
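
The arithmetic behind this sizing can be illustrated with a small sketch. The formula below (a fraction of physical memory divided by a per-vnode cost) is an assumption made for illustration; the function name, the per-mille parameter and the sizes used are hypothetical, and the kernel's actual computation is not shown in this log:

```c
#include <stdint.h>

/*
 * Number of cache entries when `permille` thousandths of physical
 * memory are budgeted and each entry costs `vnode_size` bytes.
 * 0.5% is permille = 5, 1% is permille = 10.
 */
uint64_t vnode_cache_entries(uint64_t physmem_bytes, unsigned permille,
                             uint64_t vnode_size)
{
    if (vnode_size == 0)
        return 0;
    /* Divide first so the multiplication cannot overflow. */
    return physmem_bytes / 1000 * permille / vnode_size;
}
```

Under these assumptions, doubling the fraction doubles the entry count, while doubling the per-vnode size halves it, which is why a platform with a larger struct vnode would need a larger fraction to cache the same number of vnodes.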


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
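
The iteration rule described above (keep invoking every finalizer until a full pass reports no work) can be sketched as a fixed-point loop. The types and function names below are hypothetical and are not the kernel's registration interface:

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this call. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
    finalizer_fn fn;
    void        *arg;
};

/*
 * Call every finalizer repeatedly until one complete pass finds
 * nothing left to do. Returns the number of passes made.
 */
unsigned run_finalizers(struct finalizer *f, size_t n)
{
    unsigned passes = 0;
    int did_work;

    do {
        did_work = 0;
        for (size_t i = 0; i < n; i++)
            if (f[i].fn(f[i].arg) != 0)
                did_work = 1;
        passes++;
    } while (did_work);
    return passes;
}

/* Example finalizer: *arg units of work remain, one is done per call. */
int countdown_finalizer(void *arg)
{
    int *left = arg;

    if (*left > 0) {
        (*left)--;
        return 1;
    }
    return 0;
}
```

Repeating whole passes matters because work done by one finalizer (for example, configuring a device) may create new work for another that already ran in the same pass.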


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
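
The mapping above could be expressed as a small translation function. This is
only an illustrative sketch, not code from the commit: the enum old_kern_code
and the function old_kern_to_errno() are hypothetical names, the enum values
are arbitrary, and KERN_FAILURE is left out because the log says its uses were
converted case by case.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical stand-ins for the retired KERN_* codes.  The numeric
 * values are arbitrary and are not the historical ones.
 */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code into the E* value listed in the log entry. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;	/* both shortage codes collapse to ENOMEM */
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Not reached for valid enum values. */
	assert(0 && "unmapped code");
	return EINVAL;
}
```

Note that two distinct old codes map to ENOMEM, so the translation is not
reversible.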


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
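
A rough sketch of the sizing rule described above, under stated assumptions:
the function name autosize_vnodes(), its parameters, and the per-vnode byte
cost are hypothetical and chosen for illustration; only the 0.5% figure and
the NVNODE floor come from the log entry.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Give 0.5% of memory to vnodes, but never go below the configured
 * minimum.  per_vnode_bytes is an assumed cost per vnode, not a value
 * taken from the kernel.
 */
static uint64_t
autosize_vnodes(uint64_t mem_bytes, uint64_t per_vnode_bytes,
    uint64_t min_vnodes)
{
	uint64_t budget, count;

	assert(per_vnode_bytes > 0);
	budget = mem_bytes / 200;		/* 0.5% of memory */
	count = budget / per_vnode_bytes;
	return count > min_vnodes ? count : min_vnodes;
}
```

On a small machine the floor wins, so the result is simply the minimum.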


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create a new kernel thread to issue the statfs(2) system call to check free
disk space rather than doing it in a timeout handler. This fixes the
long-standing bug that the accounting file can't be put on an NFS file system
(so, e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info have been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme, the
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second-level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
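
The rule in this entry can be sketched as follows. This is an illustration
rather than the kernel's code: initial_limit() and its parameter names are
hypothetical, with compile_default standing in for NOFILE or MAXUPRC and
system_max standing in for maxfiles or maxproc.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Build an initial resource limit: the hard limit is the system-wide
 * maximum instead of RLIM_INFINITY, and the soft limit is the
 * compile-time default clamped to that maximum.
 */
static struct rlimit
initial_limit(rlim_t compile_default, rlim_t system_max)
{
	struct rlimit rl;

	rl.rlim_max = system_max;
	rl.rlim_cur = compile_default < system_max ?
	    compile_default : system_max;
	assert(rl.rlim_cur <= rl.rlim_max);
	return rl;
}
```

With this shape, a program calling getrlimit(2) sees a finite hard limit it
can use to size its own tables, which is the motivation given in the entry.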


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
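
The detail that is easy to miss in the table above is the argument order:
bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take
the destination first. A minimal sketch of the conversions, using hypothetical
wrapper names (copy_like_bcopy() and friends are not part of any API):

```c
#include <assert.h>
#include <string.h>

/* bcopy(src, dst, len) becomes memcpy(dst, src, len): operands swap. */
static void
copy_like_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy() allowed overlapping regions, so it becomes memmove(). */
static void
copy_like_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bzero(buf, len) becomes memset(buf, 0, len). */
static void
zero_like_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}

/*
 * bcmp() only promised zero or non-zero; memcmp() keeps the operand
 * order.  The result is normalized here to 0 or 1.
 */
static int
cmp_like_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}
```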


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.543 02-Sep-2023 riastradh

heartbeat(9): Move #ifdef HEARTBEAT to sys/heartbeat.h.

Less error-prone this way, and the callers are less cluttered.


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


Revision tags: netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
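
A minimal sketch of the threshold selection described above. The function
name pick_maxfiles() and the fallback parameter are hypothetical, and 1GB is
assumed to mean 2^30 bytes; the value used below 1GB is not stated in this
message, so it is passed in rather than guessed.

```c
#include <stdint.h>

#define GiB	(UINT64_C(1) << 30)	/* assumption: 1GB means 2^30 bytes */

/*
 * Pick a default file limit from the amount of RAM.  fallback stands
 * in for whatever default applies below 1GB (not given in the commit
 * message).
 */
static int
pick_maxfiles(uint64_t ram_bytes, int fallback)
{
	if (ram_bytes >= 16 * GiB)
		return 20000;
	if (ram_bytes >= 1 * GiB)
		return 10000;
	return fallback;
}
```

As in the commit, this would only apply when MAXFILES is not set explicitly
in the kernel config file.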


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
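
The epoch usage described above (cache the epoch when seeding, reseed when it
has changed) can be sketched as follows. This is a single-threaded userland
illustration: mock_entropy_epoch() stands in for the kernel's entropy_epoch(),
and struct seeded and seeded_maybe_reseed() are hypothetical names.

```c
#include <stdbool.h>

/* Stand-in for the kernel's entropy_epoch(); a plain variable here. */
static unsigned mock_epoch = (unsigned)-1;	/* -1: never reached full entropy */

static unsigned
mock_entropy_epoch(void)
{
	return mock_epoch;
}

/* Hypothetical consumer state. */
struct seeded {
	unsigned epoch;		/* epoch cached at last seed; 0 forces reseed on first use */
	unsigned reseeds;	/* number of times we have (re)seeded */
};

/*
 * Call before using the seeded state.  Reseeds if the epoch has
 * changed since the last seed; returns true if a reseed happened.
 */
static bool
seeded_maybe_reseed(struct seeded *s)
{
	unsigned now = mock_entropy_epoch();

	if (s->epoch == now)
		return false;
	/* A real consumer would draw fresh seed material here. */
	s->epoch = now;
	s->reseeds++;
	return true;
}
```

Because the check is for a change rather than for -1, a consumer first seeded
before full entropy is reached reseeds again once consolidation moves the
epoch, which is the behaviour recommended above.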

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
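
The arithmetic can be sketched as a timespec subtraction. This is an
illustration only, not the kernel code: ts_sub() and boottime_from() are
hypothetical helpers, and the uptime argument is assumed to be the interval
since the clocks were started.

```c
#include <time.h>

/* Return a - b, normalising tv_nsec into [0, 1e9).  Assumes a >= b. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time = time of day at the moment it was initialised, minus the
 * interval that had already elapsed since the clocks were started.
 */
static struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```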

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
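
As an illustration of the kind of check the FIPS 140-2 statistical tests
perform, the simplest of them, the monobit test, can be sketched in user-space
C as follows. This is not the code in libkern/rngtest.c; monobit_ok() is a
hypothetical name, and the sample size (20000 bits) and pass interval
(9725 < ones < 10275) are the values as I recall them from FIPS 140-2, so
check them against the standard before relying on them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_BYTES 2500	/* 20000 bits per sample (assumed from FIPS 140-2) */

/* Hypothetical helper: returns 1 if the sample passes the monobit test. */
static int
monobit_ok(const uint8_t *buf, size_t len)
{
	unsigned int ones = 0;

	assert(buf != NULL && len == SAMPLE_BYTES);
	for (size_t i = 0; i < len; i++) {
		/* Count set bits by clearing the lowest one each round. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	/* Pass only if the number of ones lies strictly inside the interval. */
	return ones > 9725 && ones < 10275;
}
```

A stuck-at-zero source fails this check immediately, which is the sort of bad
or stuck hardware RNG the attach-time tests are meant to catch.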


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
until after mountroot(), like config_interrupt(9) that defers
configuration until after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
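
The deferral idea described above can be sketched in user-space C as a small
callback queue. This is a sketch under stated assumptions, not the
config_mountroot(9) implementation: defer_until_mountroot(), mountroot_done(),
the fixed-size queue and the demo callback are hypothetical, and running the
callback immediately when root is already mounted is my assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 16

struct deferred {
	void (*fn)(void *);
	void *arg;
};

static struct deferred deferred_q[MAX_DEFERRED];
static size_t ndeferred;
static int root_mounted;

/* Queue fn(arg) until root is mounted; returns 0 on success, -1 if full. */
static int
defer_until_mountroot(void (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	if (root_mounted) {
		fn(arg);	/* assumption: nothing left to wait for */
		return 0;
	}
	if (ndeferred == MAX_DEFERRED)
		return -1;
	deferred_q[ndeferred].fn = fn;
	deferred_q[ndeferred].arg = arg;
	ndeferred++;
	return 0;
}

/* Called once after the root file system is mounted: run in FIFO order. */
static void
mountroot_done(void)
{
	root_mounted = 1;
	for (size_t i = 0; i < ndeferred; i++)
		deferred_q[i].fn(deferred_q[i].arg);
	ndeferred = 0;
}

/* Demo callback (hypothetical): stands in for a firmware load step. */
static void
demo_load_firmware(void *arg)
{
	*(int *)arg += 1;
}
```

A driver in this model would queue its firmware-dependent step at attach time
and finish initialization only when mountroot_done() runs the queue.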


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure into two pieces,
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
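
How a reply at the init path prompt might be classified can be sketched in
user-space C as follows. This is only an illustration: classify_init_reply()
and the enum are hypothetical, and treating any string that starts with '/'
as a path is my assumption, not something the change states.

```c
#include <assert.h>
#include <string.h>

enum init_reply {
	REPLY_PATH,	/* try this path as init */
	REPLY_HALT,
	REPLY_REBOOT,
	REPLY_DDB,
	REPLY_INVALID	/* neither a path nor a known command */
};

/* Hypothetical helper: map one line of input to an action. */
static enum init_reply
classify_init_reply(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return REPLY_HALT;
	if (strcmp(s, "reboot") == 0)
		return REPLY_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return REPLY_DDB;
	if (s[0] == '/')	/* assumption: paths are absolute */
		return REPLY_PATH;
	return REPLY_INVALID;
}
```

Checking the special strings before the path test keeps the commands
unambiguous, and anything left over falls through to the invalid case so the
prompt can be repeated.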


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
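
The arithmetic behind this limit can be sketched as follows, assuming the
bound is 75% of the smaller of KVA and physical memory, divided by the 2kB
worst-case cost. vnode_limit() and VNODE_COST are hypothetical names, and the
real check may round differently.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST 2048u	/* assumed worst-case bytes per vnode */

/* Hypothetical helper: largest desiredvnodes value the check would accept. */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller;

	assert(kva_bytes > 0 && phys_bytes > 0);
	smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;
	/* 75% of the tighter resource, expressed in vnodes. */
	return smaller / 4 * 3 / VNODE_COST;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the KVA is the
tighter resource: 75% of 1 GiB is 805306368 bytes, which at 2048 bytes per
vnode allows 393216 vnodes.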


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
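
The list-based, reference-counted registry described above can be sketched as follows. This is a user-space illustration under stated assumptions: every `demo_` name is hypothetical and the signatures are not those of the real ttyldisc functions. The EEXIST and ENOENT results are assumptions of this sketch; only the EBUSY result for an in-use discipline comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry. */
struct demo_ldisc {
	const char *name;
	unsigned int refcnt;		/* references held by lookups */
	struct demo_ldisc *next;	/* linked list instead of an array */
};

static struct demo_ldisc *demo_ldisc_list;

/* Register a discipline; EEXIST on a duplicate name (assumption). */
static int
demo_ldisc_attach(struct demo_ldisc *ld)
{
	struct demo_ldisc *p;

	for (p = demo_ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = demo_ldisc_list;
	demo_ldisc_list = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
static struct demo_ldisc *
demo_ldisc_lookup(const char *name)
{
	struct demo_ldisc *p;

	for (p = demo_ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from demo_ldisc_lookup(). */
static void
demo_ldisc_release(struct demo_ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
static int
demo_ldisc_detach(struct demo_ldisc *ld)
{
	struct demo_ldisc **pp;

	if (ld->refcnt != 0)
		return EBUSY;
	for (pp = &demo_ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	return ENOENT;
}
```

The sketch does no locking; a kernel implementation would have to protect the list and the counts against concurrent access.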


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
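
The "once control" pattern used here and in the neighbouring commits can be sketched as below. This is a single-threaded illustration, not the kernel's implementation: `struct demo_once`, `demo_run_once` and the `demo_subsys_*` names are hypothetical, and a real kernel version would have to serialize concurrent callers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once-control record; not the kernel's own type. */
struct demo_once {
	int done;	/* nonzero after the initializer has run */
	int error;	/* value the initializer returned */
};

/*
 * Run init() on the first call only and remember its result.
 * Later calls return the remembered result without calling init().
 */
static int
demo_run_once(struct demo_once *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);

	if (!o->done) {
		o->error = init();
		o->done = 1;
	}
	return o->error;
}

static int demo_init_calls;	/* counts initializer invocations */

/* Example initializer standing in for a subsystem's init routine. */
static int
demo_subsys_init(void)
{
	demo_init_calls++;
	return 0;
}

static struct demo_once demo_subsys_once;

/* Example entry point: initializes lazily on first open. */
static int
demo_subsys_open(void)
{
	return demo_run_once(&demo_subsys_once, demo_subsys_init);
}
```

With this arrangement the subsystem no longer needs an explicit init call from main(); the first user triggers the initialization.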


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)
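
A bounds-checked key index to node map of the kind described above might look like the following sketch. All `demo_` names are hypothetical and do not reproduce the net80211 structures. The sketch assumes the map size is given as a slot count and that a node with no rx key index is simply left out of the map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical marker meaning "no hardware key index assigned". */
#define DEMO_KEYIX_NONE ((unsigned int)-1)

struct demo_node {
	int id;
	unsigned int rx_keyix;	/* DEMO_KEYIX_NONE if not mapped */
};

struct demo_keymap {
	struct demo_node **map;	/* indexed by hardware key index */
	unsigned int nslots;	/* sized from the driver's maximum */
};

/* Allocate the map once the driver has announced its maximum. */
static int
demo_keymap_init(struct demo_keymap *km, unsigned int nslots)
{
	assert(nslots > 0);
	km->map = calloc(nslots, sizeof(*km->map));
	if (km->map == NULL)
		return -1;
	km->nslots = nslots;
	return 0;
}

/* Bounds-checked lookup; NULL means fall back to a hash lookup. */
static struct demo_node *
demo_keymap_find(const struct demo_keymap *km, unsigned int keyix)
{
	if (keyix == DEMO_KEYIX_NONE || keyix >= km->nslots)
		return NULL;
	return km->map[keyix];
}

/* Record a node under its rx key index, if it has a usable one. */
static void
demo_keymap_set(struct demo_keymap *km, struct demo_node *ni)
{
	if (ni->rx_keyix != DEMO_KEYIX_NONE && ni->rx_keyix < km->nslots)
		km->map[ni->rx_keyix] = ni;
}

/* Drop the map reference when the node's unicast key is deleted. */
static void
demo_keymap_clear(struct demo_keymap *km, const struct demo_node *ni)
{
	if (ni->rx_keyix != DEMO_KEYIX_NONE && ni->rx_keyix < km->nslots &&
	    km->map[ni->rx_keyix] == ni)
		km->map[ni->rx_keyix] = NULL;
}
```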

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard
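
The two ideas in this entry, keeping the beacon and listen intervals in separate fields and dropping frames whose beacon interval is out of range, can be sketched as follows. The structure, the function names and the numeric bounds are hypothetical placeholders chosen for the example; the real limits are defined by net80211 and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bounds for illustration only. */
#define DEMO_BINTVAL_MIN 10
#define DEMO_BINTVAL_MAX 1000

struct demo_conf {
	uint16_t beacon_intval;	/* used in hostap or ibss mode */
	uint16_t listen_intval;	/* used for power save in sta mode */
};

/* Nonzero if a received beacon interval is within bounds. */
static int
demo_bintval_valid(uint16_t bintval)
{
	return bintval >= DEMO_BINTVAL_MIN && bintval <= DEMO_BINTVAL_MAX;
}

/*
 * Handle the interval carried by a beacon or probe response.
 * Returns 0 if accepted, -1 if the frame should be dropped.
 * The value is rejected rather than clamped.
 */
static int
demo_recv_bintval(uint16_t *node_intval, uint16_t bintval)
{
	assert(node_intval != NULL);

	if (!demo_bintval_valid(bintval))
		return -1;
	*node_intval = bintval;
	return 0;
}

/* Changing the power save setting touches only the listen interval. */
static void
demo_set_powersave(struct demo_conf *conf, uint16_t listen_intval)
{
	conf->listen_intval = listen_intval;
}
```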

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)
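
A minimal user-space sketch of gathering one 16-byte cipher block from an arbitrarily fragmented buffer chain. `struct seg` and `gather_block` are hypothetical stand-ins for mbufs and the driver code, not the net80211 implementation. The length arithmetic is signed, so an empty or malformed (negative-length) segment is skipped rather than wrapping around.

```c
#include <stddef.h>
#include <string.h>

#define BLOCK_LEN 16    /* AES block size in bytes */

/* Hypothetical buffer segment standing in for an mbuf. */
struct seg {
    struct seg *next;
    const unsigned char *data;
    int len;            /* signed on purpose, see below */
};

/*
 * Copy up to BLOCK_LEN bytes into block, starting at offset *offp of
 * segment *segp and crossing as many segments as needed.  The cursor
 * (*segp, *offp) is advanced past the copied bytes.
 * Returns the number of bytes copied (less than BLOCK_LEN at end of chain).
 */
int
gather_block(struct seg **segp, int *offp, unsigned char block[BLOCK_LEN])
{
    struct seg *s = *segp;
    int off = *offp;
    int got = 0;

    while (s != NULL && got < BLOCK_LEN) {
        int space = s->len - off;   /* signed: <= 0 means nothing usable */
        int n;

        if (space <= 0) {
            s = s->next;
            off = 0;
            continue;
        }
        n = space < BLOCK_LEN - got ? space : BLOCK_LEN - got;
        memcpy(block + got, s->data + off, (size_t)n);
        got += n;
        off += n;
    }
    *segp = s;
    *offp = off;
    return got;
}
```

A caller would loop on `gather_block()` until it returns fewer than `BLOCK_LEN` bytes, treating the short final block according to the cipher mode in use.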

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
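
A minimal user-space sketch of how such an expansion could work. `magic_expand`, `struct magic_var` and the table values are hypothetical and are not the kernel's code. The sketch assumes a magic name only matches when it is followed by '/' or the end of the string, which keeps `@machine` from matching the front of `@machine_arch`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; the values below are illustrative only. */
struct magic_var {
    const char *name;   /* text that follows '@' */
    const char *value;  /* replacement text */
};

static const struct magic_var magic_vars[] = {
    { "machine_arch", "i386" },
    { "machine",      "i386" },
    { "ostype",       "NetBSD" },
};

/*
 * Expand "@name" components of src into dst (capacity dstlen bytes,
 * including the terminating NUL).
 * Returns 0 on success, -1 if dst is too small.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen)
{
    size_t out = 0;
    size_t i;

    if (dstlen == 0)
        return -1;

    while (*src != '\0') {
        const char *rep = NULL;
        size_t replen = 1;  /* bytes to emit */
        size_t skip = 1;    /* bytes of src consumed */

        if (*src == '@') {
            for (i = 0; i < sizeof(magic_vars) / sizeof(magic_vars[0]); i++) {
                size_t n = strlen(magic_vars[i].name);

                if (strncmp(src + 1, magic_vars[i].name, n) == 0 &&
                    (src[1 + n] == '/' || src[1 + n] == '\0')) {
                    rep = magic_vars[i].value;
                    replen = strlen(rep);
                    skip = n + 1;
                    break;
                }
            }
        }
        if (rep == NULL)
            rep = src;      /* no match: copy one literal character */
        if (out + replen >= dstlen)
            return -1;      /* keep room for the terminating NUL */
        memcpy(dst + out, rep, replen);
        out += replen;
        src += skip;
    }
    dst[out] = '\0';
    return 0;
}
```

With the table above, the link target `/arch/@machine_arch/bin` expands to `/arch/i386/bin`, while an unknown name such as `@machinex` is left untouched.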


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
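
A minimal sketch of a marker-skipping iteration macro and a foreach-call helper built on it. `struct entry`, `ENTRY_FOREACH` and `entry_foreach_call` are hypothetical simplifications, not the kernel's definitions; in particular this sketch does not insert its own marker to hold its place while the callback runs.

```c
#include <stddef.h>

/* Hypothetical list entry; is_marker flags placeholders left by an iterator. */
struct entry {
    struct entry *next;
    int is_marker;
    int pid;
};

/* Visit every entry on the list, skipping marker entries. */
#define ENTRY_FOREACH(var, head)                                    \
    for ((var) = (head); (var) != NULL; (var) = (var)->next)        \
        if ((var)->is_marker)                                       \
            continue;                                               \
        else

/* Call fn(e, arg) for each real entry; stop early on a nonzero return. */
int
entry_foreach_call(struct entry *head,
    int (*fn)(struct entry *, void *), void *arg)
{
    struct entry *e;
    int ret;

    ENTRY_FOREACH(e, head) {
        ret = fn(e, arg);
        if (ret != 0)
            return ret;
    }
    return 0;
}

/* Example callback: count the entries visited through an int pointer. */
int
count_entry(struct entry *e, void *arg)
{
    (void)e;
    (*(int *)arg)++;
    return 0;
}
```

The trailing `else` lets the macro be followed by either a single statement or a braced block, the same way a plain `for` loop would be.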


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
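
A minimal sketch of the sizing rule, assuming the cache size is a fixed fraction of physical memory divided by the per-vnode size, with a configured minimum. `vnode_cache_size` and its parameters are hypothetical names; the fraction is given in parts per thousand so that both 0.5% (5) and 1% (10) can be expressed in integer arithmetic.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit in (permille / 1000) of physical memory,
 * but never fewer than min_vnodes.  vnode_bytes must be nonzero.
 */
uint64_t
vnode_cache_size(uint64_t physmem_bytes, uint64_t vnode_bytes,
    unsigned permille, uint64_t min_vnodes)
{
    /* Divide first so the multiplication cannot overflow for sane inputs. */
    uint64_t n = physmem_bytes / 1000 * permille / vnode_bytes;

    return n < min_vnodes ? min_vnodes : n;
}
```

Passing a larger `vnode_bytes` for the same fraction yields fewer vnodes, which is why a port with a larger struct vnode may want a larger fraction.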


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
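
A minimal sketch of the "run until nobody did any work" loop described above. `struct finalizer`, `run_finalizers` and `drain_one` are hypothetical names, and the registration side is omitted.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
    finalizer_fn fn;
    void *arg;
};

/*
 * Invoke every finalizer repeatedly until a full pass completes in
 * which none of them reports work.  Returns the number of passes made.
 */
int
run_finalizers(struct finalizer *list, size_t n)
{
    int passes = 0;
    int did_work;
    size_t i;

    do {
        did_work = 0;
        for (i = 0; i < n; i++) {
            if (list[i].fn(list[i].arg))
                did_work = 1;
        }
        passes++;
    } while (did_work);

    return passes;
}

/* Example finalizer: arg points to a count of pending work items. */
int
drain_one(void *arg)
{
    int *pending = arg;

    if (*pending == 0)
        return 0;
    (*pending)--;
    return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for example, configuring a RAID set) may expose new work for another that has already run.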


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
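
The one-to-one part of this mapping can be sketched as a lookup function. The `OLD_KERN_*` enum and `old_kern_to_errno` are hypothetical: the enum's numeric values are arbitrary, and the function returns -1 for KERN_FAILURE, which has no single errno equivalent in the table above.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
    OLD_KERN_SUCCESS,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_FAILURE,
    OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old code to an errno value, or -1 if there is no fixed one. */
int
old_kern_to_errno(enum old_kern_code code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:
        return 0;
    case OLD_KERN_INVALID_ADDRESS:
        return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE:
        return EACCES;
    case OLD_KERN_NO_SPACE:
    case OLD_KERN_RESOURCE_SHORTAGE:
        return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:
        return EINVAL;
    default:
        return -1;  /* KERN_FAILURE: handled case by case */
    }
}
```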


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
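
A minimal sketch of what a power-of-two alignment constraint means for an address. `is_pow2` and `align_up` are hypothetical helpers for illustration and are not part of the uvm_map() interface.

```c
#include <stdint.h>

/* Nonzero if x is a power of two. */
int
is_pow2(uintptr_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align.
 * align must be a power of two; the result is undefined otherwise.
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works because a power of two minus one has all of its low bits set, which is why the interface restricts the alignment to powers of two.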


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
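
A minimal sketch of the two properties listed above: the caller owns the handle storage, and insertion and removal are constant time. It uses a small timing wheel of intrusive lists; `struct wheel`, `wheel_insert`, `wheel_remove`, `wheel_tick` and `WHEEL_SIZE` are hypothetical names and this is not the kernel's callout code. Handles must be zero-initialized before first use, and callbacks in this sketch must not remove other pending callouts.

```c
#include <stddef.h>

#define WHEEL_SIZE 8    /* hypothetical; must be a power of two */

/* Caller-supplied handle: no allocation happens inside the wheel. */
struct callout {
    struct callout *next;
    struct callout **prevp;     /* address of the pointer pointing at us */
    unsigned long expires;      /* absolute tick at which to fire */
    void (*func)(void *);
    void *arg;
};

struct wheel {
    struct callout *bucket[WHEEL_SIZE];
    unsigned long now;
};

/* O(1): push the handle on the bucket for its expiry tick. */
void
wheel_insert(struct wheel *w, struct callout *c, unsigned long ticks,
    void (*func)(void *), void *arg)
{
    struct callout **head = &w->bucket[(w->now + ticks) & (WHEEL_SIZE - 1)];

    c->expires = w->now + ticks;
    c->func = func;
    c->arg = arg;
    c->next = *head;
    c->prevp = head;
    if (*head != NULL)
        (*head)->prevp = &c->next;
    *head = c;
}

/* O(1): unlink through the back pointer; harmless if not pending. */
void
wheel_remove(struct callout *c)
{
    if (c->prevp == NULL)
        return;
    *c->prevp = c->next;
    if (c->next != NULL)
        c->next->prevp = c->prevp;
    c->next = NULL;
    c->prevp = NULL;
}

/* Advance one tick and fire every callout due now; returns how many fired. */
int
wheel_tick(struct wheel *w)
{
    struct callout *c, *next;
    int fired = 0;

    w->now++;
    for (c = w->bucket[w->now & (WHEEL_SIZE - 1)]; c != NULL; c = next) {
        next = c->next;
        if (c->expires == w->now) {     /* skip entries for later laps */
            wheel_remove(c);
            c->func(c->arg);
            fired++;
        }
    }
    return fired;
}

/* Example callback: increment an int counter. */
void
bump(void *arg)
{
    (*(int *)arg)++;
}
```

Because the handle lives in the client's own structure, scheduling a timeout can never fail for lack of memory, which is the resource-allocation problem the old API had.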


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].
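
A minimal sketch of stripping a leading path, assuming a plain
NUL-terminated string; ex_basename is a hypothetical name and this is not
the code from the commit itself.

```c
#include <string.h>

/*
 * Hypothetical helper: return the component after the last '/',
 * or the whole string if it contains no '/'.
 */
static const char *
ex_basename(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash != NULL ? slash + 1 : path;
}
```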


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
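
The rule described above can be sketched as follows. The struct and function
names are hypothetical stand-ins; only the min() relationship between the
soft default and the system-wide maximum comes from the log message.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rlimit. */
struct ex_rlimit {
	uint64_t rlim_cur;	/* soft limit */
	uint64_t rlim_max;	/* hard limit */
};

/*
 * The hard limit is the system-wide maximum (e.g. maxfiles or maxproc),
 * not infinity; the soft limit is the compile-time default (e.g. NOFILE
 * or MAXUPRC) clamped to that maximum.
 */
static struct ex_rlimit
ex_init_limit(uint64_t soft_default, uint64_t system_max)
{
	struct ex_rlimit lim;

	lim.rlim_max = system_max;
	lim.rlim_cur = soft_default < system_max ? soft_default : system_max;
	return lim;
}
```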


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
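
The substitutions above swap the source and destination arguments for the
copy routines, which is easy to get wrong. A small sketch with hypothetical
helper names:

```c
#include <string.h>

/* Old form: bcopy(src, dst, len). New form puts the destination first. */
static void
ex_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);	/* regions must not overlap; else memmove */
}

/* Old form: bzero(buf, len). */
static void
ex_clear(void *buf, size_t len)
{
	memset(buf, 0, len);
}

/* Old form: bcmp(a, b, len), nonzero when the buffers differ. */
static int
ex_differ(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}
```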


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.542 07-Jul-2023 riastradh

heartbeat(9): New mechanism to check progress of kernel.

This uses hard interrupts to check progress of low-priority soft
interrupts, and one CPU to check progress of another CPU.

If no progress has been made after a configurable number of seconds
(kern.heartbeat.max_period, default 15), then the system panics --
preferably on the CPU that is stuck so we get a stack trace in dmesg
of where it was stuck, but if the stuckness was detected by another
CPU and the stuck CPU doesn't acknowledge the request to panic within
one second, the detecting CPU panics instead.

This doesn't supplant hardware watchdog timers. It is possible for
hard interrupts to be stuck on all CPUs for some reason too; in that
case heartbeat(9) has no opportunity to complete.

Downside: heartbeat(9) relies on hardclock to run at a reasonably
consistent rate, which might cause trouble for the glorious tickless
future. However, it could be adapted to take a parameter for an
approximate number of units that have elapsed since the last call on
the current CPU, rather than treating that as a constant 1.

XXX kernel revbump -- changes struct cpu_info layout


Revision tags: netbsd-10-base
# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
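
The thresholds above might be expressed roughly as in this sketch. It assumes
"GB" means 2^30 bytes and takes the small-memory default as a parameter,
since the message does not state it; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: pick a default open-file limit from the amount of
 * RAM, falling back to a caller-supplied value on small machines.
 */
static int
ex_default_maxfiles(uint64_t ram_bytes, int fallback)
{
	const uint64_t gib = 1024ULL * 1024 * 1024;	/* assumed unit */

	if (ram_bytes >= 16 * gib)
		return 20000;
	if (ram_bytes >= 1 * gib)
		return 10000;
	return fallback;
}
```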


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
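
The epoch-caching usage described in the rndsink(9) replacement notes above
can be sketched in userland as follows. ex_entropy_epoch() is a hypothetical
stand-in for the kernel's entropy_epoch(), and the generator structure is
invented for illustration; the real reseeding work is elided.

```c
#include <stdbool.h>

/* Hypothetical stand-in for entropy_epoch(); the real epoch is never 0. */
static unsigned ex_epoch = 1;

static unsigned
ex_entropy_epoch(void)
{
	return ex_epoch;
}

struct ex_generator {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned reseeds;	/* how many times we reseeded */
};

/*
 * Call before using the generator: reseed if the epoch has changed since
 * the last seed. Returns true when a reseed happened.
 */
static bool
ex_generator_use(struct ex_generator *g)
{
	unsigned now = ex_entropy_epoch();

	if (g->cached_epoch == now)
		return false;
	/* ... draw fresh seed material here ... */
	g->cached_epoch = now;
	g->reseeds++;
	return true;
}
```

Comparing for change, rather than testing for -1, follows the advice in the
log message: a later consolidation still triggers a reseed.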

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.
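
The suspend/resume pattern listed above can be modelled in user space.
This is only a sketch under stated assumptions: every model_ name, the
boolean lock and the integer clocks are hypothetical stand-ins, not the
kernel's TODR code. The model asserts lock ownership in both helpers,
which the NOTE below says the real todr_save_systime() does not yet do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
static bool model_todr_held;	/* stands in for the TODR mutex */
static long model_rtc;		/* stands in for the hardware TOD clock */
static long model_systime;	/* stands in for the system time */

static void
model_todr_lock(void)
{
	assert(!model_todr_held);	/* the model lock is not recursive */
	model_todr_held = true;
}

static void
model_todr_unlock(void)
{
	assert(model_todr_held);
	model_todr_held = false;
}

static bool
model_todr_lock_owned(void)
{
	return model_todr_held;
}

/* Save the system time into the clock; the caller must hold the lock. */
static void
model_todr_save_systime(void)
{
	assert(model_todr_lock_owned());
	model_rtc = model_systime;
}

/* Reload the system time from the clock; the caller must hold the lock. */
static void
model_todr_set_systime(void)
{
	assert(model_todr_lock_owned());
	model_systime = model_rtc;
}

/* One lock acquisition covers the whole save / sleep / restore sequence. */
static void
model_suspend_resume(long slept)
{
	model_todr_lock();
	model_todr_save_systime();
	model_rtc += slept;	/* the clock keeps running while asleep */
	model_todr_set_systime();
	model_todr_unlock();
}
```

Holding the lock across the whole sequence is the point of the pattern:
nothing else can touch the clock between the save and the restore.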

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).
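
The class of bug is easy to reproduce in isolation. The commit does not
say which expression lost its cast, so the function names and types below
are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers; names and types are assumptions, not kernel code. */

/* Buggy: the product is computed in 32 bits and wraps before widening. */
static uint64_t
avail_bytes_no_cast(uint32_t npages, uint32_t pagesize)
{
	uint32_t product = npages * pagesize;	/* wraps modulo 2^32 */

	return product;
}

/* Fixed: widen one operand first so the multiply happens in 64 bits. */
static uint64_t
avail_bytes_with_cast(uint32_t npages, uint32_t pagesize)
{
	return (uint64_t)npages * pagesize;
}
```

With 2^21 pages of 4096 bytes the true size is 2^33 bytes, which the
32-bit product cannot represent.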

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
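
A rough sketch of what circular sibling lists could look like, assuming
each CPU carries a core ID and a package ID. The structure, field and
function names are hypothetical and are not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU record; not the kernel's struct cpu_info. */
struct model_cpu {
	int core_id;
	int package_id;
	struct model_cpu *core_sibling;		/* next CPU in same core */
	struct model_cpu *package_sibling;	/* next CPU in same package */
};

/*
 * Link every CPU to the next CPU (in array order, wrapping around) that
 * shares its core, and likewise for its package.  A CPU with no sibling
 * ends up pointing at itself, so each list is always circular.
 */
static void
model_link_siblings(struct model_cpu *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		int found_core = 0, found_pkg = 0;

		/* k == n wraps back to CPU i itself, so both are found. */
		for (size_t k = 1; k <= n; k++) {
			struct model_cpu *c = &cpus[(i + k) % n];

			if (c->package_id != cpus[i].package_id)
				continue;
			if (!found_pkg) {
				cpus[i].package_sibling = c;
				found_pkg = 1;
			}
			if (!found_core && c->core_id == cpus[i].core_id) {
				cpus[i].core_sibling = c;
				found_core = 1;
			}
		}
	}
}
```

Following core_sibling or package_sibling repeatedly from any CPU returns
to that CPU after visiting all of its siblings.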


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
while the simpler version that is already committed only reports that a leak
has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
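
The subtraction amounts to ordinary timespec arithmetic. A minimal
sketch, assuming the time of day and the interval since the clocks were
started are both available as struct timespec values; the function name
is hypothetical and this is not the kernel's code.

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: boot time is the time of day at initialisation
 * minus the interval that has passed since the clocks were started.
 */
static struct timespec
boottime_from(struct timespec now, struct timespec since_clock_start)
{
	struct timespec bt;

	bt.tv_sec = now.tv_sec - since_clock_start.tv_sec;
	bt.tv_nsec = now.tv_nsec - since_clock_start.tv_nsec;
	if (bt.tv_nsec < 0) {
		/* Borrow one second to keep tv_nsec in [0, 1e9). */
		bt.tv_sec--;
		bt.tv_nsec += 1000000000L;
	}
	return bt;
}
```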

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)
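
The deferred setup described above can be pictured with a small
single-threaded model of run-once semantics. All model_ names are
hypothetical; this is not the kernel's RUN_ONCE implementation and, unlike
the real thing, it makes no attempt at thread safety.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, non-thread-safe model of a run-once control block. */
struct model_once {
	bool done;	/* has the init routine already run? */
	int error;	/* cached result of the init routine */
};

/* Run fn at most once; later calls return the cached result. */
static int
model_run_once(struct model_once *o, int (*fn)(void))
{
	if (!o->done) {
		o->error = fn();
		o->done = true;
	}
	return o->error;
}

static int model_hook_count;	/* how many times the hook was set up */

/* Stand-in for establishing the exit hook. */
static int
model_establish_exithook(void)
{
	model_hook_count++;
	return 0;
}

static struct model_once model_sem_once;

/* Stand-in for a sys_sem* entry point: set up the hook on first use. */
static int
model_sys_semop(void)
{
	int error;

	error = model_run_once(&model_sem_once, model_establish_exithook);
	if (error != 0)
		return error;
	/* ... the actual semaphore operation would follow here ... */
	return 0;
}
```

Because the hook is only established on the first syscall, a module that
is unloaded before any syscall runs has no hook to remove, which is why
the KASSERT mentioned above had to go.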

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as are lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
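
A sketch of how such a list might be consumed, assuming the kernel tries
each path in order and stops at the first one that can be executed. The
list contents come from the message above; the helper names and the
callback are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* List contents as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Hypothetical helper: return the first path for which try_exec()
 * reports success (0), or NULL if every candidate fails.
 */
static const char *
first_usable_init(const char *const *paths, int (*try_exec)(const char *))
{
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Stand-in that pretends only the rescue copy can be executed. */
static int
only_rescue_works(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Stand-in that pretends nothing can be executed. */
static int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

Placing "/rescue/init" last means it only comes into play when every
copy under /sbin has failed.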


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
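
The two rules above reduce to a small decision function. This is a
sketch of the stated policy only; the enum, the function name and the
boolean securelevel argument are hypothetical and do not reflect how
secmodel_extensions(9) is actually written.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical identifiers for the three settings named above. */
enum model_knob {
	MODEL_CURTAIN,
	MODEL_USERMOUNT,
	MODEL_USER_SET_CPU_AFFINITY
};

/*
 * May the given setting be switched to "enable" right now?
 * securelevel_above_0 stands in for the "is-securelevel-above" result.
 */
static bool
model_change_allowed(enum model_knob knob, bool enable,
    bool securelevel_above_0)
{
	if (!securelevel_above_0)
		return true;		/* no restriction at low securelevel */
	if (knob == MODEL_CURTAIN)
		return enable;		/* may be enabled, never disabled */
	return !enable;			/* may be disabled, never enabled */
}
```

In both cases a raised securelevel only permits the change that tightens
the system: turning curtain on, or turning usermount and
user_set_cpu_affinity off.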

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem for good.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in human-readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
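
The behaviour described above (a lockless fast path, a global lock on the
slow path, and no retry after an error) can be sketched roughly as follows.
This is an illustration only, not the NetBSD implementation: the names
once_sketch_t, run_once_sketch, once_global_lock, demo_init and
demo_init_fail are hypothetical, and a C11 atomic_flag spin lock stands in
for whatever global lock the kernel actually uses.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical once-control block; not the kernel's structure. */
typedef struct {
	atomic_int done;	/* 0 = not run yet, 1 = ran (success or error) */
	int error;		/* result of the single run */
} once_sketch_t;

/* One global lock shared by every once-control block (assumption). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(once_sketch_t *o, int (*fn)(void))
{
	/* Fast path: no lock taken once the routine has run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;

	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; an error is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}

	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->error;
}

/* Hypothetical init routines used only to exercise the sketch. */
static int demo_calls;

static int
demo_init(void)
{
	demo_calls++;
	return 0;
}

static int
demo_init_fail(void)
{
	demo_calls++;
	return 5;	/* arbitrary non-zero error code */
}
```

Calling run_once_sketch() repeatedly on the same control block runs the
routine a single time and returns the stored result afterwards, including
when that result is an error.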


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
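
The limit described above amounts to a small calculation, sketched here
under stated assumptions. Only the 2 kB worst-case cost and the 75%
ceilings come from the commit message; the names VNODE_COST_BYTES,
vnode_limit and vnode_request_ok, and the idea of passing the sizes in as
byte counts, are hypothetical and not taken from the NetBSD sources.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode plus associated data (from the log: 2 kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: the largest desiredvnodes value whose worst-case
 * footprint stays within 75% of both KVA and physical memory.
 */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the scarcer resource, divided by the per-vnode cost. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical check a sysctl handler could apply to a requested value. */
static int
vnode_request_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= vnode_limit(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory, KVA is the
scarcer resource, so the cap works out to 0.75 * 1 GiB / 2 kB vnodes.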


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clauses 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer the callout_startup2() call (which calls softintr_establish(9))
until after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
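
As a rough userland illustration of the weak-alias technique described above
(not the kernel code itself), the sketch below uses the GCC/Clang `weak` and
`alias` attributes on an ELF target. The names hook_start, hook_nop and
nop_calls are hypothetical stand-ins; the counter exists only so the fallback
can be observed.

```c
#include <assert.h>

/* Hypothetical counter so the fallback routine's execution is observable. */
int nop_calls;

/* Fallback routine used when no real implementation is linked in. */
void
hook_nop(void)
{
	nop_calls++;
}

/*
 * hook_start is a weak alias for hook_nop.  If another object file
 * provides a strong definition of hook_start, the linker picks that one;
 * otherwise calls to hook_start land in hook_nop.
 * Assumption: GCC or Clang on an ELF platform.
 */
void hook_start(void) __attribute__((weak, alias("hook_nop")));
```

With no strong definition present, calling hook_start() simply runs the nop
routine, which is what lets a kernel build without a secmodel still link.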


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
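
The refcounting rule in the list above (a lookup holds a reference, release
drops it, and detaching an in-use discipline fails with EBUSY) can be sketched
in userland C roughly as follows. This is a minimal single-threaded sketch,
not the tty code: the ldisc_* names and the struct layout are hypothetical,
and the locking a kernel would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; the real linesw structure differs. */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* references held by successful lookups */
	struct ldisc *next;
};

static struct ldisc *ldisc_list;	/* linked list instead of an array */

/* Register a discipline; duplicate names are rejected. */
int
ldisc_attach(struct ldisc *d)
{
	assert(d != NULL && d->name != NULL);
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look a discipline up by name; success takes a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference taken by ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister a discipline; refuses while references are outstanding. */
int
ldisc_detach(struct ldisc *d)
{
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			if (d->refcnt > 0)
				return EBUSY;
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

Because each provider registers its own entry, nothing needs to be set up
centrally at boot, which is consistent with tty_init() becoming unnecessary.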


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().
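
The "once control" pattern used in this and the preceding revisions can be
sketched in userland C as below. It is a single-threaded illustration under
stated assumptions: struct once_ctl, run_once(), and the subsys_* names are
hypothetical, and a kernel implementation would additionally need locking so
that concurrent first callers wait for initialization to finish.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once-control record. */
struct once_ctl {
	int done;	/* nonzero once the init routine has run */
	int error;	/* value the init routine returned */
};

/*
 * Run `init` the first time this is called for `o`; later calls only
 * report the saved result.  Not thread-safe (sketch only).
 */
int
run_once(struct once_ctl *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->error = init();
		o->done = 1;
	}
	return o->error;
}

static struct once_ctl subsys_once;
static int subsys_init_calls;

/* Stand-in for a subsystem init routine formerly called from main(). */
static int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

/* Entry point that lazily initializes the subsystem on first use. */
int
subsys_open(void)
{
	return run_once(&subsys_once, subsys_init);
}

/* Accessor so callers can observe how often init actually ran. */
int
subsys_init_count(void)
{
	return subsys_init_calls;
}
```

The point of the design is that main() no longer needs to know about the
subsystem: whichever entry point is reached first triggers initialization.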


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
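
To make the expansion concrete, here is a minimal userland sketch of how a
symlink target containing magic strings might be rewritten. It is not the
kernel implementation: expand_magic(), struct magic_var, and the choice to
prefer the longest matching name (so that @machine_arch is not read as
@machine followed by "_arch") are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; the names come from the list above. */
struct magic_var {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "i386" */
};

/*
 * Copy `in` to `out` (capacity `outlen`), replacing each magic string
 * with its value.  Returns 0 on success, -1 if the result does not fit.
 */
int
expand_magic(const char *in, char *out, size_t outlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t o = 0;

	assert(in != NULL && out != NULL && outlen > 0);
	while (*in != '\0') {
		const struct magic_var *best = NULL;
		size_t bestlen = 0;

		if (*in == '@') {
			/* Assumption: the longest matching name wins. */
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);
				if (n > bestlen &&
				    strncmp(in, vars[i].name, n) == 0) {
					best = &vars[i];
					bestlen = n;
				}
			}
		}
		if (best != NULL) {
			size_t vlen = strlen(best->value);
			if (o + vlen >= outlen)
				return -1;
			memcpy(out + o, best->value, vlen);
			o += vlen;
			in += bestlen;
		} else {
			/* Ordinary character, or '@' with no known name. */
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

Given a table that maps @machine_arch to i386, the target
/arch/@machine_arch/bin from the example usage would come out as
/arch/i386/bin; unknown @names are copied through unchanged.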


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
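
The marker-skipping iteration described above can be illustrated with a small
self-contained sketch. The struct, the ENT_MARKER flag, and the ENT_FOREACH
macro are hypothetical stand-ins for the proclist types; only the idea
(a foreach macro that silently steps over placeholder entries) is taken from
the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define ENT_MARKER	0x1	/* hypothetical flag: entry is a placeholder */

struct ent {
	int flags;
	int pid;
	struct ent *next;
};

/*
 * Iterate over a singly linked list, skipping marker entries.  The
 * trailing "else" makes the caller's statement the loop body.
 */
#define ENT_FOREACH(var, head)						\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (((var)->flags & ENT_MARKER) != 0)			\
			continue;					\
		else

/* Count entries that are not markers. */
int
count_live(struct ent *head)
{
	struct ent *e;
	int n = 0;

	ENT_FOREACH(e, head)
		n++;
	return n;
}

/* Sum the pids of entries that are not markers. */
int
sum_pids(struct ent *head)
{
	struct ent *e;
	int sum = 0;

	ENT_FOREACH(e, head) {
		sum += e->pid;
	}
	return sum;
}
```

A marker lets an iterating function remember its position in the list across
a point where it must drop the list lock; every other walker then has to
ignore such entries, which is what the macro encapsulates.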


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
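
The sizing rule can be written out as simple arithmetic. The sketch below is
only an illustration of "N% of physical memory divided by the per-vnode
cost"; the function name, its parameters, and any concrete vnode size passed
to it are assumptions, not the kernel's actual computation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many vnodes fit in `percent` percent of
 * physical memory, given an assumed per-vnode size in bytes.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned percent)
{
	assert(vnode_size > 0 && percent <= 100);
	/* Divide first to avoid overflow on very large memory sizes. */
	return (physmem_bytes / 100 * percent) / vnode_size;
}
```

The same memory budget buys fewer vnodes when struct vnode is larger, which
is the reason given above for 64-bit ports possibly wanting more than 1%.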


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
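
The "iterate until no finalizer reports work" rule described above can be sketched as follows. This is only an illustration under stated assumptions: finalizer_fn, run_finalizers(), example_finalizer() and pending_units are hypothetical names invented here, not the kernel's actual interface.

```c
#include <stddef.h>

/* Hypothetical type: a finalizer returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void);

/* Hypothetical state for the example: units still waiting to be configured. */
int pending_units = 2;

/* Example finalizer: configures one pending unit per call. */
int
example_finalizer(void)
{
	if (pending_units > 0) {
		pending_units--;
		return 1;	/* did work, so another pass is needed */
	}
	return 0;		/* nothing left to do */
}

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports work. Returns the number of passes made.
 */
int
run_finalizers(finalizer_fn *fns, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (fns[i]())
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}
```

The extra, idle pass is what lets one finalizer react to devices that another finalizer made visible during the previous pass.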


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
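
The mapping above can be read as a translation function. The sketch below is illustrative only: the enum old_kern_code, its member names and old_kern_to_errno() are hypothetical, and the enum values are not taken from the old headers. The change itself replaced the codes at each call site rather than translating them at run time.

```c
#include <errno.h>

/* Illustrative enum; the numeric values are NOT the historical ones. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Hypothetical helper: return the E* code listed for each KERN_* code. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		/*
		 * KERN_FAILURE has no single replacement ("various");
		 * -1 here only marks "no one-to-one mapping".
		 */
		return -1;
	}
}
```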


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
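
A minimal sketch of the sizing rule described above, assuming a hypothetical helper autosize_vnodes() and a caller-supplied per-vnode size (the log entry does not state the real structure size). It does not model the "not specified explicitly in kernel config" check.

```c
#include <stdint.h>

/*
 * Hypothetical helper: give vnodes 0.5% of physical memory, but never
 * fewer than nvnode_floor entries (standing in for NVNODE).
 * vnode_bytes is an assumed per-vnode cost and must be nonzero.
 */
uint64_t
autosize_vnodes(uint64_t physmem_bytes, uint64_t vnode_bytes,
    uint64_t nvnode_floor)
{
	uint64_t budget = physmem_bytes / 200;	/* 0.5% == 1/200 */
	uint64_t count = budget / vnode_bytes;

	return count > nvnode_floor ? count : nvnode_floor;
}
```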


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second-level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
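
The two rules above amount to a clamp. A minimal sketch, assuming a hypothetical helper init_limit() in which compiled_default stands in for NOFILE or MAXUPRC and system_max for maxfiles or maxproc:

```c
#include <sys/resource.h>

/*
 * Hypothetical helper, not the kernel's code: initialize one resource
 * limit from a compiled-in default and a system-wide maximum.
 */
void
init_limit(struct rlimit *rl, rlim_t compiled_default, rlim_t system_max)
{
	/* Hard limit: the system-wide maximum, not RLIM_INFINITY. */
	rl->rlim_max = system_max;

	/* Soft limit: the compiled-in default, clamped to the hard limit. */
	rl->rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;
}
```

With this shape, a program that sizes a table from the limits returned by getrlimit(2) sees finite values, and the soft limit can never exceed the hard limit.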


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
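
The argument-order swap in the first two rewrites is easy to get wrong. A minimal sketch of the mapping, using hypothetical wrapper names (compat_bcopy() and friends are invented for this illustration and are not part of the tree):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrappers, named only for this illustration. */

/* bcopy(src, dst, len): source first; memcpy() takes the destination first. */
void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): regions may overlap, hence memmove(). */
void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): same argument order as memcmp(). */
int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len): memset() gains an explicit fill value of 0. */
void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```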


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.541 26-Oct-2022 riastradh

kern/init_main.c: Get extern lwp0 from sys/lwp.h.


Revision tags: bouyer-sunxi-drm-base
# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
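
The thresholds described above can be sketched as a small selection function. This is only an illustration of the stated numbers, not the code that was committed: the function name, the SKETCH_GB constant, and the small_default parameter (standing in for whatever value smaller machines get) are all assumptions.

```c
#include <stdint.h>

#define SKETCH_GB (UINT64_C(1) << 30)

/*
 * Hypothetical helper: choose a default open-file limit from the amount
 * of RAM in bytes. small_default is what machines under 1GB would get;
 * the log message does not state that value, so it is left to the caller.
 */
static int
sketch_default_maxfiles(uint64_t ram_bytes, int small_default)
{
	if (ram_bytes >= 16 * SKETCH_GB)
		return 20000;
	if (ram_bytes >= 1 * SKETCH_GB)
		return 10000;
	return small_default;
}
```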


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
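
The epoch usage rule in the list above (cache the epoch when seeding, reseed when it has changed, start the cache at 0 to force a reseed on first use) might look roughly like the following. This is a user-space sketch under stated assumptions: sketch_entropy_epoch() is a hypothetical stand-in that returns a plain variable, and struct sketch_prng and its reseed counter are invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the epoch counter; never 0 by convention. */
static unsigned sketch_epoch = 1;

static unsigned
sketch_entropy_epoch(void)
{
	return sketch_epoch;
}

struct sketch_prng {
	unsigned cached_epoch;	/* 0 means "never seeded", forcing a reseed */
	unsigned reseed_count;	/* illustrative bookkeeping only */
};

/*
 * Reseed if the epoch differs from the one cached at the last seeding.
 * Returns true when a reseed happened. Checking for a change, rather
 * than for the value -1, also catches later consolidations.
 */
static bool
sketch_prng_check_reseed(struct sketch_prng *p)
{
	unsigned now = sketch_entropy_epoch();

	if (p->cached_epoch == now)
		return false;
	p->cached_epoch = now;	/* cache the epoch as of this seeding */
	p->reseed_count++;	/* a real consumer would draw new seed material here */
	return true;
}
```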

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
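
The locking pattern described above can be modelled in user space as follows. This is a sketch of the intended shape only: every sketch_ name is hypothetical, the lock is a plain flag rather than a kernel mutex, and the assertions in the save path show the eventual goal rather than the currently disabled state mentioned in the note.

```c
#include <assert.h>
#include <stdbool.h>

/* All names here are hypothetical stand-ins, not the kernel's definitions. */
static bool sketch_todr_locked;	/* models the TODR mutex */
static long sketch_todr_clock;	/* models the time-of-day hardware */
static long sketch_systime;	/* models the system time */

static void
sketch_todr_lock(void)
{
	assert(!sketch_todr_locked);
	sketch_todr_locked = true;
}

static void
sketch_todr_unlock(void)
{
	assert(sketch_todr_locked);
	sketch_todr_locked = false;
}

/* Caller must hold the lock: copy system time into the TOD clock. */
static void
sketch_todr_save_systime(void)
{
	assert(sketch_todr_locked);
	sketch_todr_clock = sketch_systime;
}

/* Caller must hold the lock: set system time from the TOD clock. */
static void
sketch_todr_set_systime(void)
{
	assert(sketch_todr_locked);
	sketch_systime = sketch_todr_clock;
}

/* Suspend/resume pattern: one lock hold spans the save and the restore. */
static void
sketch_suspend_resume(long elapsed_while_asleep)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	sketch_todr_clock += elapsed_while_asleep;	/* hardware keeps ticking */
	sketch_todr_set_systime();
	sketch_todr_unlock();
}

/* Historically named wrapper: acquire and release around the real work. */
static void
sketch_inittodr(void)
{
	sketch_todr_lock();
	sketch_todr_set_systime();
	sketch_todr_unlock();
}
```

Splitting the locked primitives from the wrappers lets a suspend path hold the lock across both steps, while older callers keep using the wrapper without knowing about the lock.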


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
while the simpler version that is already committed just reports the occurrence
of a leak.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
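
The subtraction described above amounts to timespec arithmetic with a borrow from the seconds field. The following is a minimal sketch, assuming the time of day at initialisation and the uptime at that moment are both available as struct timespec; the function name is hypothetical and this is not the kernel's code.

```c
#include <time.h>

/*
 * Hypothetical helper: estimate when the clocks were started by
 * subtracting the uptime at initialisation from the time of day
 * at initialisation.
 */
static struct timespec
sketch_boottime(struct timespec now, struct timespec uptime)
{
	struct timespec bt;

	bt.tv_sec = now.tv_sec - uptime.tv_sec;
	bt.tv_nsec = now.tv_nsec - uptime.tv_nsec;
	if (bt.tv_nsec < 0) {
		/* Borrow one second so tv_nsec stays in [0, 1e9). */
		bt.tv_sec--;
		bt.tv_nsec += 1000000000L;
	}
	return bt;
}
```

For example, a time of day of 1000.2 s set 6.5 s after the clocks started gives a boot time of 993.7 s.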

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
  - vfs_syncer_add_to_worklist(struct mount *mp) to add
  - vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into the kernel.
Add:
    options SPLASHSCREEN
    makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
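
A minimal sketch of how such a list might be consulted, assuming the
candidates are simply tried in list order until one is usable. The
array contents are the ones quoted above; pick_init, the init_usable_fn
predicate and the two example predicates are hypothetical names for
illustration and are not taken from init_main.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list quoted in the commit message above. */
const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical predicate: reports whether a candidate can be started. */
typedef int (*init_usable_fn)(const char *path);

/* Return the first usable path in list order, or NULL if none is. */
const char *
pick_init(const char *const *paths, init_usable_fn usable)
{
	assert(paths != NULL && usable != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (usable(paths[i]))
			return paths[i];
	}
	return NULL;
}

/* Example predicate: pretend only the rescue copy is intact. */
int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}

/* Example predicate: pretend no candidate is usable. */
int
none_usable(const char *path)
{
	(void)path;
	return 0;
}
```

With only_rescue the sketch falls through the first three entries and
returns "/rescue/init"; with none_usable it returns NULL, the case in
which a caller would need some other fallback.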


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from;
this stops kern.maxvnodes from being set so high that it exhausts
available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device and
booted_partition variables instead of passing partial information to
setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN(), ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
            result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
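
The prompt handling described in this entry can be illustrated with a small
sketch. It is not the code from init_main.c: the enum, the function name, and
the rule that a usable path must start with '/' are assumptions made for the
example. Only the three special strings come from the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a line typed at the init path prompt. */
enum init_reply_sketch {
	INIT_REPLY_PATH,	/* treat the input as a path to an init program */
	INIT_REPLY_HALT,	/* special string "halt" */
	INIT_REPLY_REBOOT,	/* special string "reboot" */
	INIT_REPLY_DDB,		/* special string "ddb" */
	INIT_REPLY_USAGE	/* neither a path nor a command: print usage */
};

static enum init_reply_sketch
classify_init_reply_sketch(const char *line)
{
	assert(line != NULL);

	if (strcmp(line, "halt") == 0)
		return INIT_REPLY_HALT;
	if (strcmp(line, "reboot") == 0)
		return INIT_REPLY_REBOOT;
	if (strcmp(line, "ddb") == 0)
		return INIT_REPLY_DDB;

	/* Assumption for this sketch: a usable path is absolute. */
	if (line[0] == '/')
		return INIT_REPLY_PATH;

	return INIT_REPLY_USAGE;
}
```

A caller would loop on the prompt until the result is something other than
INIT_REPLY_USAGE.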


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
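
A rough userland sketch of the pattern described above, using C11 atomics.
All names here are hypothetical and the spin lock is only a stand-in for the
kernel's global lock; the real implementation may differ in every detail.
The points it tries to show are the lockless fast path and the fact that a
failed routine is not retried.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names throughout; this is not the kernel's once code. */
enum { ONCE_NEW = 0, ONCE_DONE = 1 };

struct once_sketch {
	atomic_int state;	/* ONCE_NEW until the routine has been run */
	int error;		/* result of the one and only run */
};

/* One global lock shared by every once control (stand-in spin lock). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o && fn);

	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;	/* spin until the global lock is free */

	if (atomic_load_explicit(&o->state, memory_order_relaxed) == ONCE_NEW) {
		o->error = fn();	/* runs exactly once, even on failure */
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);

	return o->error;
}

/* Demo routine that always fails, to show that errors are not retried. */
static int demo_calls;

static int
demo_init_fails(void)
{
	demo_calls++;
	return 5;
}
```

A zero-initialized struct once_sketch starts in the ONCE_NEW state, so no
per-object static initializer is needed. Running the routine while holding a
spin lock is a simplification made to keep the sketch short.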


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
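
The arithmetic behind this limit can be sketched as follows. The function
and macro names are invented for the example, and the real check in the
kernel may round or order the operations differently; only the 2kB cost and
the 75% thresholds come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode in bytes, as assumed by the commit. */
#define VNODE_COST_SKETCH	2048u

/*
 * Largest vnode count that stays within 75% of both KVA and physical
 * memory. Taking the smaller of the two pools enforces both limits.
 */
static uint64_t
vnode_limit_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	return (smaller / 4 * 3) / VNODE_COST_SKETCH;
}

/* Returns nonzero if a requested desiredvnodes value would be accepted. */
static int
desiredvnodes_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= vnode_limit_sketch(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA is the binding
constraint and the limit works out to 393216 vnodes.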


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
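
The linked-list registration and reference counting described in the list
above might look roughly like this. It is a single-threaded sketch with
invented names (the _sketch suffix marks everything as hypothetical) and
without the locking a kernel implementation would need; the EEXIST and
ENOENT returns are assumptions, only the EBUSY behaviour comes from the
commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line discipline record kept on a singly linked list. */
struct ldisc_sketch {
	const char *name;
	unsigned refcnt;		/* references held by lookups */
	struct ldisc_sketch *next;
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline; the provider calls this, not a central table. */
static int
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch *p;

	for (p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	struct ldisc_sketch *p;

	for (p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from a lookup. */
static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Unregister a discipline; refuses while references are outstanding. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch **pp;

	if (d->refcnt != 0)
		return EBUSY;
	for (pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

Detaching while a lookup reference is still held fails with EBUSY, and
succeeds once the reference has been released.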


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching(ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
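
The expansion described above can be illustrated with a small userland
sketch. This is not the kernel implementation: the struct magic_var type,
the expand_magic() function, and the plain prefix-matching rule are all
assumptions made for illustration, and the values in the table would come
from the running system rather than from a caller-supplied array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair for one magic string. */
struct magic_var {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "i386" */
};

/*
 * Copy src to dst, replacing every magic name found in vars with its
 * value.  Returns 0 on success, -1 if the result does not fit.
 * Names that share a prefix must be listed longest first
 * ("@machine_arch" before "@machine") so the longest match wins.
 */
static int
expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const char *piece = src;	/* default: copy one byte */
		size_t piecelen = 1, skip = 1;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (strncmp(src, vars[i].name, n) == 0) {
					piece = vars[i].value;
					piecelen = strlen(vars[i].value);
					skip = n;
					break;
				}
			}
		}
		/* Keep one byte free for the terminating NUL. */
		if (out + piecelen >= dstlen)
			return -1;
		memcpy(dst + out, piece, piecelen);
		out += piecelen;
		src += skip;
	}
	dst[out] = '\0';
	return 0;
}
```

With a table that maps @machine_arch to "i386", the link target
/arch/@machine_arch/bin from the example usage would expand to
/arch/i386/bin. An unrecognized "@" is copied through unchanged in this
sketch.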


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
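
The sizing rule discussed here (a fixed fraction of physical memory, never
below a configured minimum) can be sketched as follows. The function name,
its parameters, and the use of a per-mille fraction are assumptions for
illustration; the real kernel computes this from its own variables and
sizeof(struct vnode).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many vnodes fit in per_mille/1000 of physical
 * memory, but never fewer than min_vnodes.  per_mille = 5 models the
 * 0.5% default, per_mille = 10 models the 1% default.
 */
static uint64_t
default_vnode_count(uint64_t physmem_bytes, unsigned per_mille,
    uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t budget, count;

	assert(vnode_size > 0);

	budget = physmem_bytes * per_mille / 1000;	/* bytes set aside */
	count = budget / vnode_size;			/* whole vnodes only */

	return count < min_vnodes ? min_vnodes : count;
}
```

Doubling the fraction doubles the result until the minimum takes over, and
a larger struct vnode (as on 64-bit ports) shrinks the count for the same
amount of memory, which is the concern raised in the last paragraph above.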


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
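
The "run every finalizer until none of them reports work" loop can be
sketched like this. The finalizer_fn type, struct finalizer, and
run_finalizers() are hypothetical names used only for illustration; they
are not the kernel's autoconfiguration interface.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until a complete pass finishes with
 * no finalizer reporting work.  Returns the number of passes made.
 */
static int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0, did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: performs one pending unit of work per call. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Iterating to a fixed point matters because work done by one finalizer (for
example, configuring a RAID set) may expose new devices that another
finalizer then has to handle on the next pass.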


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
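
The mapping above could be written as a lookup function, as a rough
sketch. The enum old_kern_code type and its OLD_KERN_* members are
hypothetical stand-ins for the retired codes (their numeric values here
are illustrative only), and returning -1 for KERN_FAILURE is just a way of
marking "no single errno", since the log says those cases mostly became
assertions.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old code to the errno value listed in the table. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	case OLD_KERN_FAILURE:			return -1; /* no single errno */
	}
	return -1;	/* unknown code */
}
```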


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
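
A small sketch of the kind of overflow this avoids, assuming the amount of
memory is derived from a page count and a page size. The pages_to_bytes()
helper is hypothetical, and uint64_t is used here only as a stand-in for a
wide unsigned type such as rlim_t.

```c
#include <stdint.h>

/*
 * Widen before multiplying: with 32-bit arithmetic, 1048576 pages of
 * 4096 bytes (4 GiB) would wrap around to 0.
 */
static uint64_t
pages_to_bytes(uint32_t npages, uint32_t page_size)
{
	return (uint64_t)npages * page_size;
}
```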


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
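
For readers unfamiliar with power-of-two alignment, the usual rounding
idiom looks like the sketch below. The align_up() helper is hypothetical
and is not part of the uvm_map() interface; it only shows what "minimum
power-of-two alignment of the region" means for a start address.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	/* A power of two has exactly one bit set. */
	assert(align != 0 && (align & (align - 1)) == 0);

	return (addr + align - 1) & ~(align - 1);
}
```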


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
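
The two properties listed above (caller-supplied storage and constant-time
insertion and removal) can be illustrated with a small userland sketch. This
is not the kernel's callout code: the names callout_sketch, callout_insert,
callout_remove, callout_noop and WHEEL_SIZE are hypothetical, and the bucket
array is only an assumed stand-in for whatever structure the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

#define WHEEL_SIZE 8	/* hypothetical bucket count; a power of two */

/* Caller-owned handle: the links live in the handle, so nothing is allocated. */
struct callout_sketch {
	struct callout_sketch *next;
	struct callout_sketch **prevp;	/* address of the pointer that points at us */
	void (*func)(void *);
	void *arg;
};

static struct callout_sketch *wheel[WHEEL_SIZE];

/* Placeholder callback for experiments. */
static void
callout_noop(void *arg)
{
	(void)arg;
}

/* O(1): push onto the head of the bucket chosen by the expiry tick. */
static void
callout_insert(struct callout_sketch *c, unsigned expiry_tick,
    void (*func)(void *), void *arg)
{
	struct callout_sketch **head = &wheel[expiry_tick & (WHEEL_SIZE - 1)];

	assert(c != NULL && func != NULL);
	c->func = func;
	c->arg = arg;
	c->next = *head;
	c->prevp = head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	*head = c;
}

/* O(1): unlink without searching, thanks to the back pointer. */
static void
callout_remove(struct callout_sketch *c)
{
	assert(c != NULL && c->prevp != NULL);
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}
```

Because the handle carries its own links, a client can embed it in a
driver or protocol structure and arm or cancel it without any allocation
that might fail.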


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
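
The "low address and a size" convention described above can be sketched as
follows. This is only an illustration under stated assumptions: the enum and
the function initial_stack_pointer are hypothetical names, and real
machine-dependent code would also handle alignment and frame setup, which
this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of which way a machine's stack grows. */
enum stack_direction { STACK_GROWS_DOWN, STACK_GROWS_UP };

/*
 * Given the lowest address of a stack area and its size, return the
 * address the child's stack pointer would start from.
 */
static uintptr_t
initial_stack_pointer(uintptr_t low_addr, size_t size, enum stack_direction dir)
{
	assert(size > 0);
	if (dir == STACK_GROWS_DOWN)
		return low_addr + size;	/* start at the top, grow toward low_addr */
	return low_addr;		/* start at the bottom, grow upward */
}
```

Keeping the caller's description direction-neutral means machine-independent
code never needs to know how a given port lays out its stack.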


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
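
The limit initialization described above can be sketched in a few lines. The
struct and function names here (limit_sketch, min_long, init_limit) are
hypothetical stand-ins rather than the kernel's own code; only the rule
itself (soft limit clamped to the system maximum, hard limit equal to the
system maximum) comes from the log entry.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rlimit; field names mirror the log text. */
struct limit_sketch {
	long rlim_cur;	/* soft limit */
	long rlim_max;	/* hard limit */
};

static long
min_long(long a, long b)
{
	return a < b ? a : b;
}

/*
 * Soft limit: the compile-time default, clamped to the system-wide maximum.
 * Hard limit: the system-wide maximum itself, not "infinity".
 */
static struct limit_sketch
init_limit(long compile_default, long system_max)
{
	struct limit_sketch l;

	assert(system_max > 0);
	l.rlim_cur = min_long(compile_default, system_max);
	l.rlim_max = system_max;
	return l;
}
```

For the file limit, compile_default would play the role of NOFILE and
system_max the role of maxfiles; for the process limit, MAXUPRC and maxproc.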


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
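
The mappings above are easy to get wrong because bcopy() takes the source
first while memcpy() and memmove() take the destination first. The sketch
below wraps each replacement in a function whose parameters use the legacy
order; the wrapper names are hypothetical and exist only to make the
argument swap visible.

```c
#include <assert.h>
#include <string.h>

/*
 * Illustrative wrappers only (hypothetical names, not kernel code):
 * each shows how a legacy call maps onto its ISO C replacement.
 */

/* bcopy(src, dst, len) -> memcpy(dst, src, len): source and destination swap. */
static void
copy_legacy_order(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): overlap-safe copy. */
static void
copy_overlapping_legacy_order(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memmove(dst, src, len);
}

/* bzero(buf, len) -> memset(buf, 0, len): the fill byte becomes explicit. */
static void
zero_buffer(void *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0, len);
}
```

bcmp() is the one case with no reordering: memcmp() takes the same three
arguments in the same order.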


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.540 21-Jul-2022 simonb

Removed unused opt_wapbl.h include.


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
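
The thresholds in this entry can be written as a small selection function.
This is a sketch, not the kernel's code: default_maxfiles and GIB are
hypothetical names, and small_default is an assumed parameter because the
entry does not say what value applies below 1GB.

```c
#include <assert.h>
#include <stdint.h>

#define GIB	(UINT64_C(1) << 30)

/*
 * Pick a default open-file limit from the amount of RAM.
 * small_default is a hypothetical parameter: the log does not say what
 * is used below 1GB.
 */
static int
default_maxfiles(uint64_t ram_bytes, int small_default)
{
	assert(small_default > 0);
	if (ram_bytes >= 16 * GIB)
		return 20000;
	if (ram_bytes >= 1 * GIB)
		return 10000;
	return small_default;
}
```

Checking the larger threshold first keeps the two ranges from overlapping.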


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.
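
The epoch usage described above can be sketched as follows. This is a
userland illustration, not kernel code: entropy_epoch() is replaced by
a stand-in that returns a test variable, its unsigned return type is
an assumption, and struct consumer and its functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for the kernel's entropy_epoch(); it returns a variable so
 * the sketch can be exercised outside the kernel.
 */
static unsigned fake_epoch = (unsigned)-1;	/* -1: never had full entropy */

static unsigned
entropy_epoch(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that keeps a seeded generator. */
struct consumer {
	unsigned	cached_epoch;	/* 0 forces a reseed on first use */
	unsigned	nreseed;	/* number of reseeds so far */
};

static void
consumer_init(struct consumer *c)
{
	c->cached_epoch = 0;
	c->nreseed = 0;
}

/* Call before each use: reseed iff the epoch changed since last seeding. */
static bool
consumer_maybe_reseed(struct consumer *c)
{
	unsigned epoch = entropy_epoch();

	assert(epoch != 0);		/* the epoch is never zero */
	if (epoch == c->cached_epoch)
		return false;
	/* ... draw fresh seed material here ... */
	c->cached_epoch = epoch;
	c->nreseed++;
	return true;
}
```

Because the comparison is for change rather than for -1, the consumer
also reseeds on every later consolidation, which is the behaviour the
last point above recommends.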

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
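
The class of bug fixed here can be sketched as follows. The types and
function names are illustrative assumptions and do not reproduce the
kernel's actual expression; the point is only that the widening cast
must be applied before the multiplication.

```c
#include <assert.h>
#include <stdint.h>

/* Without the cast the product wraps at 32 bits before it is widened. */
static uint64_t
bytes_without_cast(uint32_t npages, uint32_t pagesize)
{
	assert(pagesize != 0);
	return npages * pagesize;
}

/* With the cast the multiplication is carried out in 64 bits. */
static uint64_t
bytes_with_cast(uint32_t npages, uint32_t pagesize)
{
	assert(pagesize != 0);
	return (uint64_t)npages * pagesize;
}
```

For example, 2097152 pages of 4096 bytes (8 GiB) wrap to 0 in the
first function on a platform with 32-bit int, and give 8589934592 in
the second.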


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
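
One way to picture the circular sibling lists is the sketch below. It
is an assumption-laden illustration: struct cpu, its fields, and the
linking functions are hypothetical and much simpler than the kernel's
struct cpu_info.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NCPU 4

/* Hypothetical, simplified stand-in for per-CPU topology data. */
struct cpu {
	int		core_id;
	int		package_id;
	struct cpu	*core_next;	/* next sibling in the same core */
	struct cpu	*package_next;	/* next sibling in the same package */
};

/* Find the next CPU after i, in circular order, in the same group. */
static struct cpu *
next_sibling(struct cpu *cpus, size_t n, size_t i, bool same_core)
{
	size_t k;

	for (k = 1; k <= n; k++) {
		struct cpu *c = &cpus[(i + k) % n];

		if (c->package_id != cpus[i].package_id)
			continue;
		if (same_core && c->core_id != cpus[i].core_id)
			continue;
		return c;	/* k == n lands back on CPU i itself */
	}
	return &cpus[i];	/* not reached for n > 0 */
}

/* Build the per-core and per-package rings for all CPUs. */
static void
topology_link(struct cpu *cpus, size_t n)
{
	size_t i;

	assert(cpus != NULL);
	for (i = 0; i < n; i++) {
		cpus[i].core_next = next_sibling(cpus, n, i, true);
		cpus[i].package_next = next_sibling(cpus, n, i, false);
	}
}
```

A CPU with no siblings simply points at itself, so walking a ring
always terminates when it returns to the starting CPU.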


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
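
The subtraction described above can be sketched as follows. This is a
userland illustration of the arithmetic only; ts_sub() and
boottime_from() are hypothetical helpers, not the kernel's functions.

```c
#include <assert.h>
#include <time.h>

/* a - b for normalized timespecs with a >= b. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {		/* borrow one second */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time is the time of day at the moment it was initialized, minus
 * the uptime that had already elapsed since the clocks were started.
 */
static struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	assert(tod_at_init.tv_sec >= uptime_at_init.tv_sec);
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With a time of day of 1540857600.25 s and 6.5 s of uptime at that
moment, the computed boot time is 1540857593.75 s.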

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
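
What makes a UUID "v4 (random)" can be sketched as follows, using the
RFC 4122 layout on a plain 16-byte array. This is an illustration, not
the kernel's code: uuid_v4_from_random() is a hypothetical helper, and
the caller is assumed to have filled the array from a cryptographic
random number generator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Turn 16 random bytes (in RFC 4122 network byte order) into a
 * version 4 UUID by overwriting the version and variant bits.
 * All other bits are left as supplied by the caller.
 */
static void
uuid_v4_from_random(uint8_t uuid[16])
{
	assert(uuid != NULL);
	uuid[6] = (uint8_t)((uuid[6] & 0x0f) | 0x40);	/* version 4 */
	uuid[8] = (uint8_t)((uuid[8] & 0x3f) | 0x80);	/* variant 10x */
}
```

Since only six bits are fixed, 122 bits of the result come straight
from the random input, and no MAC address or timestamp is involved.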


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
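
A rough sketch of how such a list could be consumed, assuming the
paths are tried in order until one can be executed. The array contents
come from this commit; pick_init() and the try_exec callbacks are
hypothetical and stand in for the kernel's actual exec attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path for which try_exec() reports success (0),
 * or NULL if every candidate fails.
 */
static const char *
pick_init(const char *const *paths, int (*try_exec)(const char *))
{
	size_t i;

	assert(paths != NULL && try_exec != NULL);
	for (i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Test callback: pretend that only the rescue binary exists. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Test callback: pretend that no init binary exists at all. */
static int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

With only_rescue() the earlier three candidates fail and the rescue
path is chosen; with nothing_works() the result is NULL.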


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device, booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
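
The continuous-output check described above can be sketched roughly as
follows. This is a minimal userland illustration, not the code in
libkern/rngtest.c or sys/dev/rnd.c: the block size, struct crng_state
and crng_check() are all hypothetical names chosen for the example.
The idea is only that each block read from a source is compared with
the previous block, and an exact repeat marks the source as stuck.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define CRNG_BLOCK 8	/* bytes per comparison block (assumed size) */

/* Hypothetical per-source state; not the kernel's actual structure. */
struct crng_state {
	unsigned char last[CRNG_BLOCK];	/* previous block seen */
	bool have_last;			/* false until the first block */
};

/*
 * Feed one block from the source.  Returns true if the block is
 * acceptable, false if it repeats the previous block (stuck source).
 * The block is remembered either way for the next comparison.
 */
static bool
crng_check(struct crng_state *st, const unsigned char *block)
{
	bool ok;

	assert(st != NULL && block != NULL);
	ok = !st->have_last || memcmp(st->last, block, CRNG_BLOCK) != 0;
	memcpy(st->last, block, CRNG_BLOCK);
	st->have_last = true;
	return ok;
}
```

A caller following the policy in this commit would detach the source
when crng_check() returns false, rather than halting the system.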

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
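
The handling of answers at the init path prompt can be pictured with
the sketch below. It is an illustration only, not the code in
init_main.c: the enum and classify_init_answer() are hypothetical, and
treating only absolute paths as valid init paths is an assumption made
for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of what the user typed at the prompt. */
enum init_answer {
	INIT_ANS_PATH,		/* looks like a path to try as init */
	INIT_ANS_HALT,
	INIT_ANS_REBOOT,
	INIT_ANS_DDB,
	INIT_ANS_INVALID	/* neither a command nor a path */
};

static enum init_answer
classify_init_answer(const char *s)
{
	assert(s != NULL);

	/* The special strings are matched before any path check. */
	if (strcmp(s, "halt") == 0)
		return INIT_ANS_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_ANS_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_ANS_DDB;

	/* Assumption: only absolute paths count as valid init paths. */
	if (s[0] == '/')
		return INIT_ANS_PATH;
	return INIT_ANS_INVALID;
}
```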


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
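
The pattern described here (one global lock, an atomic "done" flag, a
lockless fast path, and no retry after an error) might look roughly
like this in userland C11. It is a sketch under stated assumptions, not
the kernel's once implementation: struct once, run_once() and the demo
routine are hypothetical names, and a pthread mutex stands in for the
kernel lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once control block. */
struct once {
	atomic_int done;	/* 0 = not run yet, 1 = run (success or error) */
	int error;		/* result recorded from the init routine */
};

/* One global lock shared by all once blocks (assumed design). */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Lockless fast path: already run, return the recorded result. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = fn();
		/* Publish even on failure: no retry after an error. */
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Example init routine that always fails, counting its invocations. */
static int demo_calls;

static int
demo_failing_init(void)
{
	demo_calls++;
	return 5;	/* arbitrary nonzero error code */
}
```

Because the flag is set regardless of the routine's result, later
callers see the stored error immediately instead of blocking or
re-running the routine.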


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
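
The reference-counting rule described above (a lookup holds a reference,
a release drops it, and detaching an in-use discipline fails with EBUSY)
can be sketched as follows. This is only an illustration, not the kernel
code: the ldisc_* names and the fixed-size table are assumptions made for
brevity (the commit itself describes a linked list), and locking is
omitted entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAXDISC 8	/* arbitrary table size for this sketch */

/* Hypothetical registry entry; not the kernel's struct linesw. */
struct ldisc {
	const char *name;	/* NULL marks a free slot */
	unsigned int refcnt;	/* references held by lookups */
};

static struct ldisc disc_table[MAXDISC];

/* Find an attached discipline by name without taking a reference. */
static struct ldisc *
ldisc_find(const char *name)
{
	for (size_t i = 0; i < MAXDISC; i++) {
		if (disc_table[i].name != NULL &&
		    strcmp(disc_table[i].name, name) == 0)
			return &disc_table[i];
	}
	return NULL;
}

/* Register a discipline; the provider calls this itself. */
static int
ldisc_attach(const char *name)
{
	if (ldisc_find(name) != NULL)
		return EEXIST;
	for (size_t i = 0; i < MAXDISC; i++) {
		if (disc_table[i].name == NULL) {
			disc_table[i].name = name;
			disc_table[i].refcnt = 0;
			return 0;
		}
	}
	return ENOSPC;
}

/* Look up a discipline and hold a reference on it. */
static struct ldisc *
ldisc_lookup(const char *name)
{
	struct ldisc *d = ldisc_find(name);

	if (d != NULL)
		d->refcnt++;
	return d;
}

/* Drop a reference obtained from ldisc_lookup(). */
static void
ldisc_release(struct ldisc *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Unregister a discipline; refuse while references remain. */
static int
ldisc_detach(const char *name)
{
	struct ldisc *d = ldisc_find(name);

	if (d == NULL)
		return ENOENT;
	if (d->refcnt > 0)
		return EBUSY;
	d->name = NULL;
	return 0;
}
```

In this sketch a caller that has looked up "termios" blocks its removal
until it calls ldisc_release(); only then does ldisc_detach() succeed.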


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().
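
The "once control" idea used in this and the neighbouring revisions is
that initialization moves out of main() and runs lazily the first time
the subsystem is entered. A rough userland analogue, using POSIX
pthread_once() rather than the kernel's own once mechanism, might look
like this. The wlan_* names and the call counter are hypothetical and
exist only to make the behaviour observable.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical subsystem state; the names are illustrative only. */
static pthread_once_t wlan_once = PTHREAD_ONCE_INIT;
static int wlan_init_calls;	/* how many times the initializer ran */

/* One-time initializer; pthread_once() runs it at most once. */
static void
wlan_do_init(void)
{
	wlan_init_calls++;
}

/*
 * Entry point that may be reached many times, for example once per
 * attached interface. It returns the number of initializer runs so far.
 */
static int
wlan_ifattach(void)
{
	int error = pthread_once(&wlan_once, wlan_do_init);

	assert(error == 0);
	return wlan_init_calls;
}
```

However often wlan_ifattach() is called, the initializer runs a single
time, so no explicit init call is needed at boot.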


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
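
How such a link target is expanded can be sketched in plain C. This is
not the kernel implementation: magic_expand() and struct magic_var are
hypothetical, the values are passed in by the caller rather than read
from the running system, and the sketch assumes that a magic name only
counts when it is followed by '/' or the end of the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: magic name (without '@') and its value. */
struct magic_var {
	const char *name;
	const char *value;
};

/*
 * Copy 'src' into 'dst' (capacity 'dstlen'), replacing each "@name"
 * that is followed by '/' or the end of the string with its value.
 * Returns 0 on success, -1 if the result does not fit.
 */
static int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const struct magic_var *hit = NULL;
		size_t hitlen = 0;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);
				char next;

				if (strncmp(src + 1, vars[i].name, n) != 0)
					continue;
				/* Require a terminator after the name. */
				next = src[1 + n];
				if (next == '/' || next == '\0') {
					hit = &vars[i];
					hitlen = n;
					break;
				}
			}
		}
		if (hit != NULL) {
			size_t vlen = strlen(hit->value);

			if (out + vlen >= dstlen)
				return -1;
			memcpy(dst + out, hit->value, vlen);
			out += vlen;
			src += 1 + hitlen;	/* skip '@' and the name */
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;	/* ordinary character */
		}
	}
	dst[out] = '\0';
	return 0;
}
```

The terminator check is what keeps "@machine" from matching the front of
"@machine_arch"; unknown names such as "@unknown" are copied through
unchanged.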


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
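
A minimal sketch of an iteration macro that skips marker entries is
given here. It is not the kernel's definition: struct proc_ent,
PF_MARKER and PROCLIST_FOREACH_SKETCH are hypothetical names, and a
hand-rolled singly linked list stands in for the real proclist.

```c
#include <assert.h>
#include <stddef.h>

#define PF_MARKER 0x1	/* hypothetical flag: entry is only a placeholder */

/* Hypothetical list entry; real processes and markers share the list. */
struct proc_ent {
	struct proc_ent *next;
	int flags;
	int pid;
};

/*
 * Walk the list like a plain foreach, but never run the loop body for
 * marker entries. The trailing "else" binds the caller's statement.
 */
#define PROCLIST_FOREACH_SKETCH(var, head)				\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (((var)->flags & PF_MARKER) != 0)			\
			continue;					\
		else

/* Count the real (non-marker) entries on a list. */
static int
count_real_procs(struct proc_ent *head)
{
	struct proc_ent *p;
	int n = 0;

	PROCLIST_FOREACH_SKETCH(p, head) {
		assert((p->flags & PF_MARKER) == 0);
		n++;
	}
	return n;
}
```

An iterator that needs to remember its position while sleeping can leave
a marker in the list, and every other walker using the macro simply
steps over it.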


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
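The sizing rule described above can be sketched as follows. This is an illustration only: the function name and parameters are hypothetical, the floor corresponds to the NVNODE minimum mentioned in rev 1.175, and the real kernel code may compute the value differently.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many vnodes fit in 1% of physical memory,
 * never going below a configured floor.
 */
static uint64_t
autosize_vnodes(uint64_t physmem_bytes, size_t vnode_size,
    uint64_t floor_vnodes)
{
	uint64_t budget = physmem_bytes / 100;	/* 1% of physical memory */
	uint64_t n = budget / vnode_size;	/* vnodes that fit in it */

	return n > floor_vnodes ? n : floor_vnodes;
}
```

Because the result is divided by the size of the vnode structure, a port with a larger struct vnode gets fewer vnodes from the same 1%, which is the concern raised for the 64-bit ports.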


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
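The "iterate until nobody did any work" behaviour can be sketched as below. All names here are hypothetical and chosen for illustration; they are not the kernel's actual hook API.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/* Example finalizer: does one unit of work per call until none is left. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}

/*
 * Run every finalizer repeatedly until a full pass reports no work.
 * Returns the number of passes made, including the final idle pass.
 */
static int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}
```

Repeating the whole pass matters because work done by one finalizer (for example, a newly configured RAID set) may create work for another that already ran.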


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
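The table above can be read as a simple translation function. The sketch below uses hypothetical stand-in names (OLD_KERN_*) and omits KERN_FAILURE and the unused codes, since the log gives no single errno for them; the commit itself removed the codes rather than adding a translator like this.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired KERN_* codes. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;		/* both collapse to ENOMEM */
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;	/* no fixed mapping for anything else */
}
```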


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running on/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
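The two properties listed above (caller-owned storage, constant-time insert and remove) can be illustrated with an intrusive doubly linked bucket list. This is a sketch under assumptions: the structure and function names are hypothetical, and the real implementation hashes callouts into a wheel of such buckets by expiry time, which is not shown here.

```c
#include <stddef.h>

/* Hypothetical handle; the caller embeds or allocates this itself. */
struct callout_sketch {
	struct callout_sketch *next;	/* next entry in the bucket */
	struct callout_sketch **prevp;	/* address of the pointer to us */
	void (*fn)(void *);		/* function to call on expiry */
	void *arg;			/* argument for fn */
};

/* Insert at the head of a bucket: O(1), no memory allocation. */
static void
bucket_insert(struct callout_sketch **head, struct callout_sketch *c)
{
	c->next = *head;
	if (*head != NULL)
		(*head)->prevp = &c->next;
	*head = c;
	c->prevp = head;
}

/* Remove from whichever bucket holds it: O(1), no list walk. */
static void
bucket_remove(struct callout_sketch *c)
{
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	*c->prevp = c->next;
	c->next = NULL;
	c->prevp = NULL;
}
```

Since the handle lives in the caller's own structure, scheduling a callout can never fail for lack of memory, which is the resource-allocation problem the first bullet refers to.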


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
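The initialization rule described above amounts to clamping the soft limit to the system-wide maximum and using that maximum as the hard limit. A minimal sketch, assuming a hypothetical helper name and plain parameters in place of the kernel's NOFILE/maxfiles and MAXUPRC/maxproc variables:

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: build an rlimit whose soft limit is the
 * compiled-in default clamped to the system maximum, and whose hard
 * limit is the system maximum rather than RLIM_INFINITY.
 */
static struct rlimit
init_limit(rlim_t compiled_default, rlim_t system_max)
{
	struct rlimit rl;

	rl.rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;
	rl.rlim_max = system_max;
	return rl;
}
```

With a finite hard limit, a program that sizes a table from getrlimit() gets a usable number instead of RLIM_INFINITY.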


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
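The detail that is easy to get wrong in this conversion is the argument order: bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. The wrappers below are hypothetical and exist only to make the mapping explicit; the commit rewrote call sites directly.

```c
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len): arguments swap. */
static void
copy_like_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): overlap-safe. */
static void
copy_like_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bzero(buf, len) -> memset(buf, 0, len): fill value made explicit. */
static void
zero_like_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

bcmp() is the one case where the argument order is unchanged, since memcmp() compares its two buffers in the same positions.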


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Stanford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.
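
For illustration, the rounding that an ALIGNBYTES-style mask supports can be
sketched as below. The names EX_ALIGNBYTES, EX_ALIGN and EX_ALIGN_DOWN are
hypothetical stand-ins, and the mask value is an assumption; the real
definitions are machine-dependent and live in each port's param.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the machine-dependent alignment mask. */
#define EX_ALIGNBYTES	(sizeof(long) - 1)

/* Round p up to the next multiple of (EX_ALIGNBYTES + 1). */
#define EX_ALIGN(p) \
	(((uintptr_t)(p) + EX_ALIGNBYTES) & ~(uintptr_t)EX_ALIGNBYTES)

/* Round p down; this is a separate operation from rounding up. */
#define EX_ALIGN_DOWN(p) \
	((uintptr_t)(p) & ~(uintptr_t)EX_ALIGNBYTES)
```

Rounding up only ever moves an address forward, which is one reading of why
1.70 notes that ALIGN() cannot be used "backwards", e.g. on a pointer that is
being moved downward.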


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.539 18-Jun-2022 andvar

fix typos in word "functions" in comments, mainly s/fuctions/functions/.


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including it's processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
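
A minimal sketch of the thresholds described above, for illustration only.
The helper name ex_default_maxfiles and the EX_GB constant are hypothetical,
and the value used below 1GB is supplied by the caller because the commit
message does not state it.

```c
#include <stdint.h>

#define EX_GB	(1024ULL * 1024 * 1024)

/*
 * Hypothetical helper: pick a default open-file limit from the amount of
 * RAM.  Only the 16GB and 1GB tiers come from the commit message; the
 * fallback for smaller machines is an assumption passed in by the caller.
 */
static int
ex_default_maxfiles(uint64_t ram_bytes, int small_default)
{
	if (ram_bytes >= 16 * EX_GB)
		return 20000;
	if (ram_bytes >= 1 * EX_GB)
		return 10000;
	return small_default;
}
```

A kernel config file that sets MAXFILES explicitly would bypass a selection
like this entirely.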


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
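
The epoch usage rule above (cache the epoch when seeding, reseed when it has
changed, initialize the cache to 0 to force a reseed on first use) can be
sketched in user space roughly as follows. Everything prefixed ex_ is
hypothetical: ex_entropy_epoch() is a mock standing in for entropy_epoch(),
and its unsigned return type is an assumption.

```c
#include <stdbool.h>

/* Mock epoch: -1 until full entropy has been reached, and never 0. */
static unsigned ex_epoch = (unsigned)-1;

static unsigned
ex_entropy_epoch(void)
{
	return ex_epoch;
}

/* Hypothetical consumer that keeps some seeded state. */
struct ex_consumer {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseeds;	/* how many times we have reseeded */
};

static void
ex_consumer_init(struct ex_consumer *c)
{
	c->cached_epoch = 0;	/* epoch is never 0, so first use reseeds */
	c->reseeds = 0;
}

/* Call before using the seeded state; returns true if a reseed happened. */
static bool
ex_consumer_use(struct ex_consumer *c)
{
	unsigned now = ex_entropy_epoch();

	if (now == c->cached_epoch)
		return false;
	/* A real consumer would draw fresh seed material here. */
	c->cached_epoch = now;
	c->reseeds++;
	return true;
}
```

Comparing for change rather than testing for -1 follows the advice in the
message: every later consolidation then triggers another reseed.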

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
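
The lock/save/set/unlock pattern above can be modelled in user space as a
rough sketch. All ex_ names are hypothetical stand-ins: a boolean replaces the
TODR mutex, two time_t variables replace the hardware clock and the system
clock, and the argument that todr_set_systime() takes is omitted. The sketch
asserts lock ownership in both helpers, which is the intended end state
rather than the interim behaviour described in the NOTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

static bool	ex_todr_locked;	/* stand-in for the TODR mutex */
static time_t	ex_todr_hw;	/* stand-in for the hardware TOD clock */
static time_t	ex_systime;	/* stand-in for the system clock */

static void
ex_todr_lock(void)
{
	assert(!ex_todr_locked);
	ex_todr_locked = true;
}

static void
ex_todr_unlock(void)
{
	assert(ex_todr_locked);
	ex_todr_locked = false;
}

static bool
ex_todr_lock_owned(void)
{
	return ex_todr_locked;
}

/* Write the system time to the TOD clock; caller holds the lock. */
static void
ex_todr_save_systime(void)
{
	assert(ex_todr_lock_owned());
	ex_todr_hw = ex_systime;
}

/* Load the system time from the TOD clock; caller holds the lock. */
static void
ex_todr_set_systime(void)
{
	assert(ex_todr_lock_owned());
	ex_systime = ex_todr_hw;
}

/* Historically-named wrappers: take the lock around the real work. */
static void
ex_resettodr(void)
{
	ex_todr_lock();
	ex_todr_save_systime();
	ex_todr_unlock();
}

static void
ex_inittodr(void)
{
	ex_todr_lock();
	ex_todr_set_systime();
	ex_todr_unlock();
}

/* Suspend/resume with the lock held across the whole sequence. */
static void
ex_suspend_resume(time_t asleep)
{
	ex_todr_lock();
	ex_todr_save_systime();
	ex_todr_hw += asleep;	/* hardware clock keeps running while asleep */
	ex_todr_set_systime();
	ex_todr_unlock();
}
```

Holding the lock across the whole suspend sequence is what keeps another
caller from touching the TOD clock between the save and the later set.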


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
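
As a rough illustration of circular sibling lists, the sketch below links
CPUs that share a core, and CPUs that share a package, into rings. The
structure and field names (ex_cpu, sibling_core, sibling_package) are
hypothetical and are not taken from the kernel's struct cpu_info.

```c
#include <stddef.h>

/* Hypothetical per-CPU record holding only topology information. */
struct ex_cpu {
	int		core_id;
	int		package_id;
	struct ex_cpu	*sibling_core;		/* ring of CPUs in same core */
	struct ex_cpu	*sibling_package;	/* ring of CPUs in same package */
};

/* Build both rings; a CPU with no siblings points at itself. */
static void
ex_cpu_link_siblings(struct ex_cpu *cpus, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct ex_cpu *ci = &cpus[i];
		int in_core = 0, in_pkg = 0;

		ci->sibling_core = ci;
		ci->sibling_package = ci;
		for (size_t j = 0; j < i && !(in_core && in_pkg); j++) {
			struct ex_cpu *cj = &cpus[j];

			if (cj->package_id != ci->package_id)
				continue;
			if (!in_pkg) {
				/* Splice ci into cj's package ring. */
				ci->sibling_package = cj->sibling_package;
				cj->sibling_package = ci;
				in_pkg = 1;
			}
			if (!in_core && cj->core_id == ci->core_id) {
				/* Splice ci into cj's core ring. */
				ci->sibling_core = cj->sibling_core;
				cj->sibling_core = ci;
				in_core = 1;
			}
		}
	}
}

/* Number of CPUs on ci's core ring, including ci itself. */
static size_t
ex_core_ring_len(const struct ex_cpu *ci)
{
	size_t len = 1;

	for (const struct ex_cpu *p = ci->sibling_core; p != ci;
	    p = p->sibling_core)
		len++;
	return len;
}

/* Number of CPUs on ci's package ring, including ci itself. */
static size_t
ex_package_ring_len(const struct ex_cpu *ci)
{
	size_t len = 1;

	for (const struct ex_cpu *p = ci->sibling_package; p != ci;
	    p = p->sibling_package)
		len++;
	return len;
}
```

With rings like these, code can visit every sibling of a given CPU by walking
until it returns to the starting point, without a separate list head.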


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
while a simpler version that is already committed just tells an occurrence of a
leak.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
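
The subtraction described above amounts to the following arithmetic, shown
here as a user-space sketch. The function names are hypothetical; the only
assumption is that both the time of day at initialisation and the interval
since the clocks started are available as struct timespec values.

```c
#include <time.h>

/* Return a - b, normalising tv_nsec into [0, 1e9). */
static struct timespec
ex_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time is the time of day at the moment it was initialised, minus
 * how long the clocks had already been running at that moment.
 */
static struct timespec
ex_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ex_timespec_sub(tod_at_init, uptime_at_init);
}
```

For example, a time of day of 1000.2s set 6.5s after the clocks started gives
a boot time of 993.7s.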

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add:    options         SPLASHSCREEN
        makeoptions     SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
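
The behaviour described above (one global lock, a lockless fast path,
and no retry after an error) can be sketched in user space as follows.
This is only an illustration under stated assumptions, not the kernel's
once implementation: struct once_ctl, ONCE_CTL_INIT, run_once() and
demo_failing_init() are hypothetical names, and a pthread mutex plus
C11 atomics stand in for the kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical user-space analogue of a once control. */
struct once_ctl {
	atomic_int done;	/* 0 = not yet run, 1 = has run (success or failure) */
	int error;		/* result of the single run */
};

#define ONCE_CTL_INIT { 0, 0 }

/* One global lock shared by every once control (assumption of this sketch). */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once_ctl *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = fn();
		/* Publish even on failure: the routine is never retried. */
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Example routine that always fails, to show that a failure is remembered. */
static int demo_calls;

static int
demo_failing_init(void)
{
	demo_calls++;
	return 5;	/* arbitrary nonzero error code for illustration */
}
```

Calling run_once() twice on the same control with demo_failing_init()
returns the same error both times and runs the routine only once. One
simplification to note: this sketch holds the global lock while the
routine runs, so a once routine that itself calls run_once() would
deadlock here.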


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
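
The cap described above reduces to simple arithmetic. The following is
a minimal user-space sketch, not the kernel code: the helper names
(vnodes_limit, vnodes_request_ok, vnodes_limit_example) are
hypothetical, and the only figures taken from this log entry are the
2kB worst-case cost per vnode and the 75% ceilings.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode plus associated structures (2kB, per the log). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest desiredvnodes value whose worst-case
 * footprint stays within 75% of both KVA and physical memory.
 */
static uint64_t
vnodes_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% == 3/4; divide before multiplying to avoid overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Returns 1 if the requested value would be accepted, 0 otherwise. */
static int
vnodes_request_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= vnodes_limit(kva_bytes, phys_bytes);
}

/* Usage example with made-up sizes: 1 GiB of KVA and 4 GiB of RAM. */
static void
vnodes_limit_example(void)
{
	const uint64_t kva = UINT64_C(1) << 30;
	const uint64_t phys = UINT64_C(4) << 30;

	/* KVA is the tighter bound: (1 GiB * 3/4) / 2048 = 393216 vnodes. */
	assert(vnodes_limit(kva, phys) == 393216);
	assert(vnodes_request_ok(393216, kva, phys));
	assert(!vnodes_request_ok(393217, kva, phys));
}
```

Taking the smaller of the two memory sizes first means a single
comparison enforces both ceilings.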


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
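
The reference-counting rules above can be sketched in userland C. This is a minimal illustration only: struct ldisc and the ldisc_*() functions are hypothetical stand-ins, not the kernel's struct linesw or ttyldisc_*() routines, and the sketch does no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record; not the kernel's struct linesw. */
struct ldisc {
	const char *name;
	int refcnt;		/* outstanding lookups */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

static struct ldisc *ldisc_head;

/* Register a discipline; EEXIST if the name is already registered. */
int ldisc_attach(struct ldisc *d)
{
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_head;
	ldisc_head = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference taken by ldisc_lookup(). */
void ldisc_release(struct ldisc *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; EBUSY while any reference is still held. */
int ldisc_detach(struct ldisc *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

In this sketch a detach attempted between a lookup and its matching release fails with EBUSY, and succeeds once the reference is dropped.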


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
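
A once control of the kind described here can be sketched as follows. This is a single-threaded userland illustration under stated assumptions: struct once, run_once() and the subsys_*() names are hypothetical, and the sketch omits the locking a kernel implementation would need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once control; no locking, single-threaded sketch only. */
struct once {
	int done;	/* nonzero once init has run */
	int error;	/* result recorded from the first run */
};
#define ONCE_INITIALIZER { 0, 0 }

/* Run init on the first call only; later calls return the saved result. */
int run_once(struct once *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->error = init();
		o->done = 1;
	}
	return o->error;
}

static int subsys_init_calls;	/* counts how often init really ran */

static int subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

static struct once subsys_once = ONCE_INITIALIZER;

/* Entry point such as a first open: initializes lazily, not from main(). */
int subsys_open(void)
{
	return run_once(&subsys_once, subsys_init);
}
```

The point of the design is that the subsystem pays its initialization cost on first use, so no explicit call from main() is required.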


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
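
The substitution itself can be sketched in userland C. This is not the kernel's implementation: struct magic and magic_expand() are hypothetical names, the values are supplied by the caller, and the longest-match rule is an assumption made here so that "@machine_arch" is not read as "@machine" followed by "_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical substitution table entry: key without the leading '@'. */
struct magic {
	const char *key;
	const char *value;
};

/*
 * Expand @key tokens from src into dst (dstlen bytes including the NUL).
 * The longest matching key wins. Returns 0 on success, -1 if dst is
 * too small to hold the expanded string.
 */
int magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic *tab, size_t ntab)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const struct magic *best = NULL;
		size_t bestlen = 0;

		if (*src == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t klen = strlen(tab[i].key);

				if (klen > bestlen &&
				    strncmp(src + 1, tab[i].key, klen) == 0) {
					best = &tab[i];
					bestlen = klen;
				}
			}
		}
		if (best != NULL) {
			size_t vlen = strlen(best->value);

			if (out + vlen >= dstlen)
				return -1;
			memcpy(dst + out, best->value, vlen);
			out += vlen;
			src += 1 + bestlen;	/* skip '@' and the key */
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;	/* ordinary character */
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With a table that maps machine_arch to i386, the link target /arch/@machine_arch/bin from the example above expands to /arch/i386/bin.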


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
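
The marker-skipping iteration can be sketched as below. All names (struct pnode, PNODE_FOREACH, pnode_foreach_call) are hypothetical, and the sketch shows only how an iterator skips marker entries; it does not insert a marker of its own or take any locks, which the real function would need to do to hold its place in the list.

```c
#include <stddef.h>

/* Hypothetical list node; is_marker flags placeholder entries. */
struct pnode {
	int is_marker;
	int pid;
	struct pnode *next;
};

/* Return the first non-marker node at or after p, or NULL. */
static struct pnode *skip_markers(struct pnode *p)
{
	while (p != NULL && p->is_marker)
		p = p->next;
	return p;
}

/* Like a plain list walk, but marker entries are never visited. */
#define PNODE_FOREACH(var, head)				\
	for ((var) = skip_markers(head); (var) != NULL;		\
	    (var) = skip_markers((var)->next))

/* Call fn on every real entry; stop early if fn returns nonzero. */
int pnode_foreach_call(struct pnode *head,
    int (*fn)(struct pnode *, void *), void *arg)
{
	struct pnode *p;
	int ret;

	PNODE_FOREACH(p, head) {
		if ((ret = fn(p, arg)) != 0)
			return ret;
	}
	return 0;
}

/* Example callback: accumulate the pids of visited entries. */
int sum_pid(struct pnode *p, void *arg)
{
	*(int *)arg += p->pid;
	return 0;
}
```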


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
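
The iterate-until-idle behaviour of the finalization hook can be sketched as follows. The types and names (finalizer_fn, struct finalizer, run_finalizers, drain_one) are hypothetical and are not the kernel's autoconfiguration interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer repeatedly until a full pass reports no work.
 * Returns the number of passes made, including the final idle pass.
 */
int run_finalizers(const struct finalizer *f, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			if (f[i].fn(f[i].arg) != 0)
				did_work = 1;
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: consumes one pending item per call. */
int drain_one(void *arg)
{
	int *pending = arg;

	assert(pending != NULL);
	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for example, configuring a RAID set) may expose new work for another.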


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.
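
The effect of the bucket count can be estimated with simple arithmetic. The helper below is a hypothetical illustration that assumes a uniform hash; it is not code from the kernel.

```c
#include <assert.h>

/* Average hash chain length (rounded down), assuming a uniform hash. */
unsigned long avg_chain_len(unsigned long nvnodes, unsigned long nbuckets)
{
	assert(nbuckets != 0);
	return nvnodes / nbuckets;
}
```

For 200,000 vnodes, 2048 buckets give an average chain of about 97 entries, while 65536 buckets give about 3.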


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
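
The mapping above can be written as a translation function. This is a sketch only: the enum and its OLD_ prefix are hypothetical, its numeric values are not the historical ones, and the unused codes and KERN_FAILURE (which has no single replacement) are left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical enum for illustration; values are not the historical ones. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
int kern_to_errno(enum old_kern_code kc)
{
	switch (kc) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Unknown code: assert in debug builds (a choice of this sketch). */
	assert(0 && "unmapped code");
	return EINVAL;
}
```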


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
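
A minimal sketch of the kind of overflow this avoids, assuming memory is computed as a page count times a page size. The helper name `avail_bytes()` is hypothetical, and `uint64_t` stands in for a wide type such as rlim_t; the actual expression in init_main.c is not shown in this log.

```c
#include <stdint.h>

/*
 * Widen before multiplying so the product is computed in 64 bits.
 * With 32-bit arithmetic, 1048576 pages * 4096 bytes would wrap to 0.
 */
static uint64_t
avail_bytes(uint32_t npages, uint32_t pagesize)
{
	return (uint64_t)npages * pagesize;
}
```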


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
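
The sizing rule can be sketched as follows. This is an assumption-laden illustration: `desired_vnodes()` and its parameters are hypothetical, and the per-vnode byte cost is passed in because the log does not say how the kernel accounts for it.

```c
#include <stdint.h>

/*
 * Budget 0.5% of memory for vnodes, but never go below the
 * compiled-in minimum (NVNODE in the log message).
 * vnode_bytes must be nonzero.
 */
static uint64_t
desired_vnodes(uint64_t mem_bytes, uint64_t vnode_bytes, uint64_t nvnode_min)
{
	uint64_t budget = mem_bytes / 200;	/* 0.5% of memory */
	uint64_t n = budget / vnode_bytes;

	return n > nvnode_min ? n : nvnode_min;
}
```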


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
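
How a low address and a size turn into an initial stack pointer can be sketched like this. The names `initial_sp()` and `enum stack_dir` are hypothetical; the real machine-dependent code also handles alignment and frame setup, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

enum stack_dir { STACK_GROWS_DOWN, STACK_GROWS_UP };

/*
 * The caller always passes the lowest address of the area plus its size.
 * On a downward-growing stack the initial pointer is the top of the area;
 * on an upward-growing stack it is the base.
 */
static uintptr_t
initial_sp(uintptr_t base, size_t size, enum stack_dir dir)
{
	return dir == STACK_GROWS_DOWN ? base + size : base;
}
```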


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
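
A sketch of the corrected initialization, under the assumption that each limit is a soft/hard pair. `struct lim` and `init_limit()` are hypothetical stand-ins for struct rlimit and the code in main(); the point is only that the hard limit is the system-wide maximum and the soft limit is clamped to it.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rlimit. */
struct lim {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

/*
 * Hard limit is the system-wide maximum (maxfiles or maxproc);
 * soft limit is the compiled-in default (NOFILE or MAXUPRC),
 * clamped so it never exceeds the hard limit.
 */
static struct lim
init_limit(uint64_t compiled_default, uint64_t system_max)
{
	struct lim l;

	l.max = system_max;
	l.cur = compiled_default < system_max ? compiled_default : system_max;
	return l;
}
```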


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
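
The conversions can be written out as thin wrappers to make the argument-order change visible. The `old_*` names are hypothetical, chosen to avoid clashing with any libc declarations. Note that bcopy() takes the source first while memcpy() takes the destination first, and that bcmp() only promises zero or non-zero while memcmp() also orders its operands.

```c
#include <string.h>

/* bcopy(src, dst, len): source first; memcpy wants destination first. */
static void
old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy() allowed overlapping regions, so it maps to memmove(). */
static void
old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp() callers only test for zero, which memcmp() also satisfies. */
static int
old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

static void
old_bzero(void *p, size_t len)
{
	memset(p, 0, len);
}
```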


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.
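
The thresholds can be sketched as a simple tiered function. `default_maxfiles()` and its `fallback` parameter are hypothetical: the log does not state what default applies below 1GB, so the sketch takes that value from the caller rather than guessing it.

```c
#include <stdint.h>

#define GiB	(1024ULL * 1024 * 1024)

/*
 * Pick a default file-table limit from the amount of RAM, using the
 * two thresholds named in the log message. Below 1GB the caller's
 * fallback is returned unchanged.
 */
static unsigned
default_maxfiles(uint64_t ram_bytes, unsigned fallback)
{
	if (ram_bytes >= 16 * GiB)
		return 20000;
	if (ram_bytes >= 1 * GiB)
		return 10000;
	return fallback;
}
```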

TODO: Adjust this and other values totally dynamically.


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
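
The epoch-caching usage described above can be sketched in userland C.
This is a model under stated assumptions, not kernel code:
fake_entropy_epoch() is a hypothetical stand-in for entropy_epoch(),
and struct seeded_thing and its helpers are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's entropy_epoch(), so that the
 * sketch runs outside the kernel.  (unsigned)-1 models "never reached
 * full entropy".
 */
static unsigned fake_epoch = (unsigned)-1;

static unsigned
fake_entropy_epoch(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that seeds itself from the entropy pool. */
struct seeded_thing {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseed_count;	/* number of reseeds so far */
};

static void
seeded_thing_init(struct seeded_thing *s)
{
	/* The epoch is never zero, so 0 forces a reseed on first use. */
	s->cached_epoch = 0;
	s->reseed_count = 0;
}

/* Call before each use; returns true if a reseed happened. */
static bool
seeded_thing_prepare(struct seeded_thing *s)
{
	unsigned epoch = fake_entropy_epoch();

	assert(epoch != 0);
	if (epoch == s->cached_epoch)
		return false;

	/* ... draw fresh seed material here ... */
	s->cached_epoch = epoch;
	s->reseed_count++;
	return true;
}
```

Comparing for any change, rather than testing for -1, follows the
advice in the message: a later consolidation then triggers a reseed
even if the system had already considered itself fully seeded.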

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
They return the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
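
The lock / save / sleep / set / unlock pattern above can be modelled in
userland C. Everything here is a hypothetical stand-in: the sketch_*
functions only mimic the calling order and do not reproduce the kernel
signatures or the real TODR hardware access.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

static bool todr_locked;	/* models the TODR mutex */
static time_t hw_clock;		/* models the time-of-day hardware clock */
static time_t system_time;	/* models the kernel's idea of the time */

static void
sketch_todr_lock(void)
{
	assert(!todr_locked);
	todr_locked = true;
}

static void
sketch_todr_unlock(void)
{
	assert(todr_locked);
	todr_locked = false;
}

/* Save the system time into the hardware clock; lock must be held. */
static void
sketch_todr_save_systime(void)
{
	assert(todr_locked);
	hw_clock = system_time;
}

/* Reload the system time from the hardware clock; lock must be held. */
static void
sketch_todr_set_systime(void)
{
	assert(todr_locked);
	system_time = hw_clock;
}

/* One suspend/resume cycle, holding the lock across the whole dance. */
static void
sketch_suspend_resume(time_t slept)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	hw_clock += slept;	/* hardware clock keeps running while asleep */
	sketch_todr_set_systime();
	sketch_todr_unlock();
}
```

Holding the lock across the whole sequence keeps another caller from
updating the clock between the save and the later restore.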


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector that makes it possible to tell where a
leak occurs, while the simpler version that is already committed only reports
that a leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are
loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several
non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
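
A rough sketch of how such a NULL-terminated list might be walked,
trying each path in order until one works. The array contents come from
the message above; first_usable_init(), its callback, and only_rescue()
are hypothetical and do not reproduce the kernel's actual start_init()
logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Hypothetical helper: return the first path for which try_exec()
 * returns 0, or NULL if every candidate fails.
 */
static const char *
first_usable_init(int (*try_exec)(const char *))
{
	assert(try_exec != NULL);

	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Example callback: pretend that only the rescue copy is executable. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

Placing "/rescue/init" last means it is only reached when the three
/sbin candidates have all failed.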


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
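
As an illustration of the kind of check described above, here is a
minimal sketch of the monobit test, one of the four FIPS 140-2
statistical tests. It is not the code in libkern/rngtest.c; the
function name and the buffer constant are assumptions made for this
example. The test counts the one bits in a 20,000-bit sample and
passes only if the count lies strictly between 9,725 and 10,275.

```c
#include <stddef.h>
#include <stdint.h>

/* FIPS 140-2 statistical tests operate on a 20,000-bit sample. */
#define MONOBIT_BYTES 2500

/*
 * Hypothetical helper: returns 1 if the sample passes the monobit
 * test (9725 < number of one bits < 10275), 0 otherwise.
 */
static int
monobit_ok(const uint8_t buf[MONOBIT_BYTES])
{
	unsigned ones = 0;

	for (size_t i = 0; i < MONOBIT_BYTES; i++) {
		/* Clear the lowest set bit until none remain. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	return ones > 9725 && ones < 10275;
}
```

A stuck source that returns all zeros fails this check immediately,
which is the sort of failure the attach-time tests are meant to catch.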

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
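
The force-flag rule in items 2 and 4 can be modelled roughly as
follows. This is a toy sketch, not the module(9) code: every toy_*
name, the flag values, and the choice of EPERM as the refusal error
are assumptions made for illustration.

```c
#include <errno.h>

#define TOY_MODFLG_MUST_FORCE	0x01	/* stands in for MODFLG_MUST_FORCE */
#define TOY_LOAD_FORCE		0x01	/* stands in for MODCTL_LOAD_FORCE */

struct toy_module {
	const char *name;
	int flags;
	int initialized;
};

/* Once set, the flag is never cleared again. */
static void
toy_require_force(struct toy_module *m)
{
	m->flags |= TOY_MODFLG_MUST_FORCE;
}

/* Mark every built-in that has not been initialized yet. */
static void
toy_builtin_require_force(struct toy_module *mods, int n)
{
	for (int i = 0; i < n; i++) {
		if (!mods[i].initialized)
			toy_require_force(&mods[i]);
	}
}

/* A flagged module loads only when the caller passes the force flag. */
static int
toy_load(struct toy_module *m, int load_flags)
{
	if ((m->flags & TOY_MODFLG_MUST_FORCE) &&
	    !(load_flags & TOY_LOAD_FORCE))
		return EPERM;	/* assumed error code */
	m->initialized = 1;
	return 0;
}
```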

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
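
The deferral idea can be sketched in plain C as a queue of callbacks
that is drained once the root file system is available. The names
below (defer_until_mountroot, mountroot_done, count_cb) are
hypothetical, and running a callback immediately when root is already
mounted is an assumption of this sketch, not a statement about the
config_mountroot(9) implementation.

```c
#include <stddef.h>
#include <stdlib.h>

struct deferred {
	void (*fn)(void *);
	void *arg;
	struct deferred *next;
};

static struct deferred *deferred_head;
static struct deferred **deferred_tail = &deferred_head;
static int root_mounted;

/* Queue fn(arg) until root is mounted; returns 0 on success, -1 on failure. */
static int
defer_until_mountroot(void (*fn)(void *), void *arg)
{
	struct deferred *d;

	if (root_mounted) {
		fn(arg);	/* nothing to wait for */
		return 0;
	}
	d = malloc(sizeof(*d));
	if (d == NULL)
		return -1;
	d->fn = fn;
	d->arg = arg;
	d->next = NULL;
	*deferred_tail = d;	/* append, so callbacks run in FIFO order */
	deferred_tail = &d->next;
	return 0;
}

/* Called once after the root file system is mounted. */
static void
mountroot_done(void)
{
	struct deferred *d;

	root_mounted = 1;
	while ((d = deferred_head) != NULL) {
		deferred_head = d->next;
		d->fn(d->arg);
		free(d);
	}
	deferred_tail = &deferred_head;
}

/* Example callback: a driver would load its firmware here. */
static void
count_cb(void *arg)
{
	++*(int *)arg;
}
```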


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
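
A user-space sketch of the pattern described above, using C11 atomics.
It is not the kernel's once code: the type and function names are
made up for this example, and a global spinlock stands in for the
global lock. The fast path takes no lock, and a failed routine is not
retried; its error is simply returned again.

```c
#include <stdatomic.h>

struct once_sketch {
	atomic_int done;	/* 0 = not yet run, 1 = finished */
	int error;		/* result of the init routine */
};

/* One lock shared by all once objects (spinlock in place of a mutex). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(struct once_sketch *o, int (*init)(void))
{
	/* Fast path: lockless once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;	/* spin until the lock is free */

	/* Re-check under the lock; another caller may have finished. */
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = init();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->error;
}

/* Demonstration routine that always fails with error 5. */
static int failing_init_calls;

static int
failing_init(void)
{
	failing_init_calls++;
	return 5;
}
```

Because the lock is global and not recursive, an init routine in this
sketch must not itself call run_once_sketch().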


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
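
The bound works out as simple arithmetic. The helper below is a
hypothetical sketch, not the committed sysctl handler; the function
names are assumptions, and it takes both sizes in bytes.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode and its data structures: 2 kB. */
#define VNODE_COST_BYTES 2048u

/*
 * Largest vnode count whose worst-case footprint stays within 75% of
 * both kernel virtual address space and physical memory.
 */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Divide before multiplying so huge inputs cannot overflow. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}

/* Returns 1 if the requested setting is acceptable, 0 otherwise. */
static int
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	return requested <= max_desiredvnodes(kva_bytes, physmem_bytes);
}
```

For example, with 1 GB of KVA and 4 GB of RAM the smaller figure
governs: 75% of 1 GB is 768 MB, which at 2 kB per vnode allows
393,216 vnodes.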


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
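
The bin* accessors listed above return time as whole seconds plus a 64-bit binary fraction of a second. A minimal userspace sketch of that fixed-point format and its conversion to micro- and nanoseconds follows. The names `demo_bintime`, `demo_bt_to_usec`, and `demo_bt_to_nsec` are hypothetical and chosen for illustration; this is not the kernel's code.

```c
#include <stdint.h>

/* Hypothetical mirror of a seconds + binary-fraction timestamp. */
struct demo_bintime {
	int64_t  sec;	/* whole seconds */
	uint64_t frac;	/* fraction of a second, in units of 2^-64 s */
};

/*
 * Convert the fraction to microseconds. Only the top 32 bits of the
 * fraction are used so the multiplication cannot overflow 64 bits.
 */
static uint32_t
demo_bt_to_usec(const struct demo_bintime *bt)
{
	return (uint32_t)(((uint64_t)1000000 * (uint32_t)(bt->frac >> 32)) >> 32);
}

/* Same idea for nanoseconds. */
static uint32_t
demo_bt_to_nsec(const struct demo_bintime *bt)
{
	return (uint32_t)(((uint64_t)1000000000 * (uint32_t)(bt->frac >> 32)) >> 32);
}
```

With `frac` set to `1 << 63` (half a second), the helpers yield 500000 microseconds and 500000000 nanoseconds respectively.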


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
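
The registration and reference-counting rules listed above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct ldisc` and the `ldisc_*` functions are hypothetical stand-ins, not the real ttyldisc_*() API, and the sketch does no locking.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry. */
struct ldisc {
	const char   *name;
	unsigned int  refcnt;	/* references held by lookups */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails if the name is already taken. */
static int
ldisc_attach(struct ldisc *d)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
static struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
static void
ldisc_release(struct ldisc *d)
{
	if (d->refcnt > 0)
		d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
static int
ldisc_detach(struct ldisc *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

A caller that looks up a discipline must release it before the provider can detach it; until then the detach attempt reports EBUSY.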


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
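
The "once control" idea used in this and the neighbouring commits can be sketched in userspace C. This is an illustration only: the names `once_t`, `run_once`, `demo_subsys_init`, and `demo_open` are hypothetical, the sketch is single-threaded, and it does not reproduce the kernel's own once mechanism, which also has to serialize concurrent callers.

```c
#include <stdbool.h>

/*
 * Hypothetical once control: records whether the init routine has run
 * and what it returned. Not thread-safe; illustration only.
 */
typedef struct {
	bool done;
	int  status;
} once_t;

#define ONCE_INITIALIZER { false, 0 }

/* Run init() the first time only; later calls return the saved status. */
static int
run_once(once_t *o, int (*init)(void))
{
	if (!o->done) {
		o->status = init();
		o->done = true;
	}
	return o->status;
}

/* Hypothetical subsystem that initializes lazily on first open. */
static int demo_init_calls;

static int
demo_subsys_init(void)
{
	demo_init_calls++;
	return 0;
}

static once_t demo_once = ONCE_INITIALIZER;

static int
demo_open(void)
{
	/* No explicit init call from main() is needed any more. */
	return run_once(&demo_once, demo_subsys_init);
}
```

The design point is that the subsystem pays its initialization cost only if something actually uses it, and main() no longer needs to know about it.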


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
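
A minimal userland sketch of the substitution described above, not the
kernel's implementation. The function name expand_magiclinks(), the
magic_vars table, and the sample values (sun3/m68k) are assumptions made
for illustration. The longer name is listed first so that @machine_arch
is not consumed as @machine.

```c
#include <stddef.h>
#include <string.h>

struct magic_var {
	const char *name;	/* magic string as it appears in the link */
	const char *value;	/* replacement text */
};

/* Sample values only; a kernel would read these from system state. */
static const struct magic_var magic_vars[] = {
	{ "@machine_arch", "m68k" },	/* must precede "@machine" */
	{ "@machine", "sun3" },
	{ "@ostype", "NetBSD" },
};

/*
 * Copy `in` to `out`, replacing each known magic string with its value.
 * Returns 0 on success, -1 if `out` (of size `outlen`) is too small.
 */
int
expand_magiclinks(const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	if (outlen == 0)
		return -1;
	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*in == '@') {
			size_t i, n;

			for (i = 0; i < sizeof(magic_vars) / sizeof(magic_vars[0]); i++) {
				n = strlen(magic_vars[i].name);
				if (strncmp(in, magic_vars[i].name, n) == 0) {
					rep = magic_vars[i].value;
					skip = n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t rn = strlen(rep);

			if (o + rn >= outlen)	/* keep room for the NUL */
				return -1;
			memcpy(out + o, rep, rn);
			o += rn;
			in += skip;
		} else {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;	/* ordinary character */
		}
	}
	out[o] = '\0';
	return 0;
}
```

With these sample values, the link target /arch/@machine_arch/bin would
resolve to /arch/m68k/bin.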


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
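
The sizing rule above can be sketched as follows. This is an
illustration under stated assumptions, not the code from init_main.c:
the function name autosize_vnodes() and its parameters are hypothetical,
and the floor corresponds to the "minimum NVNODE vnodes" rule mentioned
in rev 1.175.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the vnode-cache size to use: 1% of physical memory divided by
 * the per-vnode size, but never less than the configured floor.
 * All names here are hypothetical.
 */
uint64_t
autosize_vnodes(uint64_t physmem_bytes, size_t vnode_size, uint64_t min_vnodes)
{
	uint64_t by_memory;

	if (vnode_size == 0)
		return min_vnodes;	/* avoid dividing by zero */
	by_memory = (physmem_bytes / 100) / vnode_size;
	return by_memory > min_vnodes ? by_memory : min_vnodes;
}
```

Because the result is divided by the structure size, a port with a
larger struct vnode gets fewer vnodes for the same 1%, which is why the
log suggests a larger percentage for 64-bit ports.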


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
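
The "iterate until no finalizer reports work" rule can be sketched like
this. It is a standalone illustration, not the autoconfiguration code
itself; the type and function names (finalizer_fn, run_finalizers,
countdown_finalizer) are hypothetical.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports having done work.  Returns the number of
 * passes made, including the final idle pass.
 */
unsigned
run_finalizers(struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;
	size_t i;

	do {
		did_work = 0;
		for (i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has *arg units of work, does one per call. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

Rerunning every finalizer on each pass lets one finalizer's work (for
example, a newly configured RAID set) expose more work for another.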


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
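
The mapping above, written out as a lookup function. This is only a
sketch: the enum old_kern_code and its OLD_KERN_* members are
hypothetical stand-ins for the retired codes (their numeric values are
arbitrary), and the -1 return for codes with no single errno equivalent
is an assumption of this example.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired KERN_* codes. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code to the errno value listed in the table.
 * Returns -1 for OLD_KERN_FAILURE, which has no single equivalent
 * (the log says those sites mostly became KASSERTs).
 */
int
kern_code_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	case OLD_KERN_FAILURE:
		break;
	}
	return -1;
}
```

Note that two distinct old codes collapse into ENOMEM, so the
translation is not reversible.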


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows one to change the name of the core dump,
and to relocate it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). A process is designated by its pid
at the second-level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
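
The rule for both limits can be sketched as below. The struct
limit_pair type and the function init_count_limit() are hypothetical
names used for illustration; the real code operates on the kernel's
struct rlimit entries for proc0.

```c
#include <stdint.h>

/* Hypothetical stand-in for one soft/hard resource limit pair. */
struct limit_pair {
	uint64_t cur;	/* soft limit (rlim_cur) */
	uint64_t max;	/* hard limit (rlim_max) */
};

/*
 * Initialize a count-style limit: the hard limit is the system-wide
 * maximum (maxfiles or maxproc) rather than infinity, and the soft
 * limit is the compiled-in default clamped to that maximum.
 */
void
init_count_limit(struct limit_pair *lim, uint64_t default_soft,
    uint64_t system_max)
{
	lim->max = system_max;
	lim->cur = default_soft < system_max ? default_soft : system_max;
}
```

Clamping the soft limit keeps the invariant cur <= max even on a kernel
configured with a system-wide maximum below the default.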


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
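
The mapping above can be sketched as thin wrappers, which makes the swapped
source/destination order of the copy routines explicit. The wrapper names are
hypothetical and exist only for illustration; the standard calls are the ones
listed in the entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Old: bcopy(src, dst, len).  New: memcpy(dst, src, len), arguments swapped. */
static void
copy_nonoverlapping(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, len);
}

/* Old: ovbcopy(src, dst, len).  New: memmove(dst, src, len), overlap-safe. */
static void
copy_overlapping(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	memmove(dst, src, len);
}

/* Old: bcmp(a, b, len).  New: memcmp(a, b, len); argument order unchanged. */
static int
regions_differ(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}

/* Old: bzero(p, len).  New: memset(p, 0, len). */
static void
zero_region(void *p, size_t len)
{
	assert(p != NULL);
	memset(p, 0, len);
}
```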


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.538 19-Mar-2022 hannken

Fix locking after opendisk(), VOP_IOCTL() needs an unlocked vnode,
vn_rdwr() needs flag IO_NODELOCKED.


# 1.537 18-Mar-2022 riastradh

entropy(9): Establish the softint a little earlier.

Just need to wait until softint_establish and high-priority xcalls
will work, no later than that. Doing this earlier gives us slightly
more of a chance to ensure cprng_fast and ssp get entropy from
hardware RNG devices that rely on interrupts.


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
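
A minimal sketch of the tiering described in this entry, assuming the RAM size
is available in bytes. The function name default_maxfiles() and the
small_default parameter are hypothetical; the log does not state what value
applies below 1GB, so the caller supplies it here.

```c
#include <assert.h>
#include <stdint.h>

#define GIB ((uint64_t)1 << 30)

/*
 * Pick a default open-file limit from the amount of RAM:
 * 20,000 at 16GB or more, 10,000 at 1GB or more, otherwise the
 * caller-supplied fallback (an assumption of this sketch).
 */
static int
default_maxfiles(uint64_t ram_bytes, int small_default)
{
	assert(small_default > 0);
	if (ram_bytes >= 16 * GIB)
		return 20000;
	if (ram_bytes >= 1 * GIB)
		return 10000;
	return small_default;
}
```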


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.
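
The suspend/resume pattern listed above can be modelled in user space as a
sketch. All sim_ names are hypothetical, the lock is a plain flag rather than
a kernel mutex, and sim_todr_set_systime() takes no argument because the entry
elides the real function's arguments.

```c
#include <assert.h>
#include <stdbool.h>

static bool sim_todr_locked;	/* stands in for the TODR mutex */
static long sim_todr_chip;	/* seconds held by the simulated TOD chip */
static long sim_systime;	/* seconds held by the simulated system clock */

static void
sim_todr_lock(void)
{
	assert(!sim_todr_locked);
	sim_todr_locked = true;
}

static void
sim_todr_unlock(void)
{
	assert(sim_todr_locked);
	sim_todr_locked = false;
}

static bool
sim_todr_lock_owned(void)
{
	return sim_todr_locked;
}

/* Save system time into the chip; caller must hold the lock. */
static void
sim_todr_save_systime(void)
{
	assert(sim_todr_lock_owned());
	sim_todr_chip = sim_systime;
}

/* Load system time back from the chip; caller must hold the lock. */
static void
sim_todr_set_systime(void)
{
	assert(sim_todr_lock_owned());
	sim_systime = sim_todr_chip;
}

/* The whole sequence under one lock hold, sleeping for slept_seconds. */
static void
sim_suspend_resume(long slept_seconds)
{
	sim_todr_lock();
	sim_todr_save_systime();
	sim_todr_chip += slept_seconds;	/* only the chip ticks while asleep */
	sim_todr_set_systime();
	sim_todr_unlock();
}
```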

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
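
As a rough sketch of that subtraction (user-space C; ts_sub() and
boottime_from() are hypothetical helper names, not the kernel's):

```c
#include <assert.h>
#include <time.h>

/* Return a - b with tv_nsec normalised into [0, 1e9). Assumes a >= b. */
struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time as the moment the clocks started: the wall-clock time at
 * which the time of day was initialised, minus the uptime accumulated
 * by then.
 */
struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With an uptime of 6.5 seconds at the point the time of day is set, the
resulting boot time is 6.5 seconds earlier than the old value.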

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node not existing

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
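
A sketch of the size mismatch, using stand-in types rather than the kernel's:
it assumes an ABI like n32 (mentioned in 1.478) where a syscall argument slot
is padded to 64 bits while a user pointer is 32 bits. user_ptr32_t,
sysarg_slot and exec_args are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t user_ptr32_t;	/* stand-in for a 32-bit user pointer */

/* Stand-in for a syscall argument slot padded to a 64-bit register. */
union sysarg_slot {
	uint64_t pad;
	user_ptr32_t ptr;
};

/* Stand-in for an execve-style argument structure: path, argv, envp. */
struct exec_args {
	union sysarg_slot path;
	union sysarg_slot argp;
	union sysarg_slot envp;
};

/* Bytes needed for init's argv[] on the user stack: three pointers. */
size_t
argv_bytes(void)
{
	return 3 * sizeof(user_ptr32_t);
}

/* Bytes the buggy copy would have written: the whole argument struct. */
size_t
args_struct_bytes(void)
{
	return sizeof(struct exec_args);
}
```

Under these assumptions the two sizes are 12 and 24 bytes, so copying
sizeof(args) writes twice as much as argv[] needs.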


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)
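
The restrictions above follow from the shape of a history entry. A minimal
sketch, with hypothetical names (struct hist, hist_log(), HIST_SIZE) and no
locking or timestamps:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_SIZE 4	/* hypothetical number of entries in the ring */

struct hist_entry {
	const char *fmt;	/* only the pointer is stored, so use a literal */
	uintptr_t args[4];	/* at most four integer-sized arguments */
};

struct hist {
	struct hist_entry e[HIST_SIZE];
	size_t next;		/* slot the next entry is written to */
	size_t total;		/* entries ever logged */
};

/* Append one entry, overwriting the oldest once the ring is full. */
void
hist_log(struct hist *h, const char *fmt,
    uintptr_t a0, uintptr_t a1, uintptr_t a2, uintptr_t a3)
{
	struct hist_entry *e = &h->e[h->next];

	e->fmt = fmt;
	e->args[0] = a0;
	e->args[1] = a1;
	e->args[2] = a2;
	e->args[3] = a3;
	h->next = (h->next + 1) % HIST_SIZE;
	h->total++;
}
```

Since an entry keeps raw argument values and nothing else, a string argument
would be stored as a bare pointer; a reader outside the logging context
presumably cannot follow it, which would explain the vmstat -u limitation.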

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
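
For reference, a version 4 UUID is 16 random bytes with the version and
variant fields fixed as RFC 4122 describes. The sketch below shows only that
stamping step; uuid_v4_stamp() is a hypothetical name and the caller is
assumed to supply the random bytes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Turn 16 random bytes into a version 4 UUID in place by stamping the
 * version and variant fields defined in RFC 4122.
 */
void
uuid_v4_stamp(uint8_t u[16])
{
	u[6] = (uint8_t)((u[6] & 0x0f) | 0x40);	/* version 4 */
	u[8] = (uint8_t)((u[8] & 0x3f) | 0x80);	/* RFC 4122 variant */
}
```

No MAC address or timestamp enters the result, which is why no global state
needs to be kept.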


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
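
Since the use of the list is undocumented, the following is only a guess at
the intent: try each path in order and take the first one that works.
pick_init(), only_rescue() and the callback are hypothetical; only the list
contents come from the entry above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the log entry above. */
const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Return the first path that try_exec accepts (returns 0 for), or NULL. */
const char *
pick_init(const char *const *paths, int (*try_exec)(const char *))
{
	for (; *paths != NULL; paths++) {
		if (try_exec(*paths) == 0)
			return *paths;
	}
	return NULL;
}

/* Example predicate: pretend only the rescue binary exists. */
int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

With the example predicate, the search falls through the first three entries
and settles on the newly added last one.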


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
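
To give a flavour of these tests, here is a sketch of the simplest one, the
monobit test, over a 20,000-bit sample. It is not the libkern/rngtest.c code;
monobit_ok() is a hypothetical name and the bounds are those of the FIPS
140-2 monobit test.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_BYTES 2500	/* 20,000 bits, the FIPS 140-2 sample size */

/* Return 1 if the sample passes the monobit test, 0 otherwise. */
int
monobit_ok(const uint8_t sample[SAMPLE_BYTES])
{
	unsigned ones = 0;

	/* Count the one bits in the whole sample. */
	for (size_t i = 0; i < SAMPLE_BYTES; i++) {
		for (uint8_t b = sample[i]; b != 0; b >>= 1)
			ones += b & 1u;
	}
	/* FIPS 140-2 bounds: 9725 < ones < 10275 */
	return ones > 9725 && ones < 10275;
}
```

A stuck source that returns all zeros or all ones fails immediately, which is
the kind of broken hardware RNG the attach-time check is meant to catch.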

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.
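
The always-present op vector can be sketched like this. All names (tap_ops,
cur_tap_ops, tap_attach() and so on) are hypothetical and the vector has a
single operation; the real interface differs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op vector with a single operation. */
struct tap_ops {
	void (*tap)(void *cookie, const void *pkt, size_t len);
};

/* Stub installed by default: safe to call, does nothing. */
void
tap_stub(void *cookie, const void *pkt, size_t len)
{
	(void)cookie;
	(void)pkt;
	(void)len;
}

struct tap_ops tap_stub_ops = { .tap = tap_stub };

/* Always non-NULL, so clients call through it without any #if. */
struct tap_ops *cur_tap_ops = &tap_stub_ops;

/* A "real" implementation that adds up the bytes it sees. */
void
tap_count(void *cookie, const void *pkt, size_t len)
{
	(void)pkt;
	*(size_t *)cookie += len;
}

struct tap_ops tap_real_ops = { .tap = tap_count };

/* Called when the real implementation attaches. */
void
tap_attach(void)
{
	cur_tap_ops = &tap_real_ops;
}
```

A client simply calls cur_tap_ops->tap(...); before the real implementation
attaches the call is a no-op, afterwards it reaches the real code.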

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
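
The behaviour described above (lockless fast path, a global lock on the slow path, no retry after an error) can be sketched roughly as follows. This is a minimal userland illustration, not the kernel code: the names once_sketch, run_once_sketch and example_init are hypothetical, and a C11 atomic_flag spin lock stands in for whatever global lock the kernel actually uses.

```c
#include <stdatomic.h>

/* Hypothetical control block; not the kernel's once(9) structure. */
struct once_sketch {
	atomic_int done;	/* 0 = routine not yet run, 1 = finished */
	int error;		/* result recorded by the first caller */
};

/* Single global lock shared by all once blocks (assumed stand-in). */
static atomic_flag once_sketch_lock = ATOMIC_FLAG_INIT;

int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_sketch_lock,
	    memory_order_acquire))
		continue;
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* The result is stored even on failure, so there is no retry. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_sketch_lock, memory_order_release);
	return o->error;
}

/* Example routine that fails, to show the no-retry behaviour. */
static int example_init_calls;

static int
example_init(void)
{
	example_init_calls++;
	return 5;	/* arbitrary non-zero error code */
}
```

Calling run_once_sketch() twice with example_init runs the routine once and hands the same error back to both callers.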


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
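
As a rough worked example of the limit described above, assuming the cap is simply 75% of the smaller of the two quantities divided by the 2kB worst-case cost (the helper name and its parameters are hypothetical and are not taken from the kernel source):

```c
#include <stdint.h>

/* Worst-case bytes per vnode, per the commit message. */
#define VNODE_COST_SKETCH	2048u

/*
 * Hypothetical helper: largest desiredvnodes value that keeps vnode
 * memory within 75% of KVA and 75% of physical memory.
 */
uint64_t
max_desiredvnodes_sketch(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Divide before multiplying to avoid overflow on large inputs. */
	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}
```

Under these assumptions, 1 GiB of KVA and 4 GiB of physical memory would cap desiredvnodes at 393216.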


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier, i.e. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
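The reference-counting rule described above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be sketched in user-space C. This is only an illustration: the names `ldisc`, `ldisc_attach`, `ldisc_lookup`, `ldisc_release`, and `ldisc_detach` are hypothetical stand-ins, not the kernel's ttyldisc_*() functions, and the locking a kernel would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; not the kernel's struct linesw. */
struct ldisc {
	const char *name;
	unsigned int refcnt;	/* references handed out by lookups */
	struct ldisc *next;
};

struct ldisc *ldisc_list;	/* head of the registered list */

/* Register a discipline; refuse duplicates by name. */
int
ldisc_attach(struct ldisc *ld)
{
	assert(ld != NULL && ld->name != NULL);
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Find a discipline by name and take a reference on it. */
struct ldisc *
ldisc_lookup(const char *name)
{
	assert(name != NULL);
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	assert(ld != NULL && ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister a discipline; fails with EBUSY while references remain. */
int
ldisc_detach(struct ldisc *ld)
{
	assert(ld != NULL);
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with one ldisc_release(); only then can ldisc_detach() succeed.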


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
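The "once control" idea used in this and the neighbouring revisions can be pictured with a small user-space sketch. The names `once_ctl`, `run_once`, and `example_init` are hypothetical, and this version is single-threaded; a kernel implementation would also need locking around the check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once control: remembers whether init already ran. */
struct once_ctl {
	int done;	/* nonzero after the first call */
	int status;	/* value returned by the init function */
};

/* Run init() on the first call only; later calls return the saved status. */
int
run_once(struct once_ctl *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->status = init();
		o->done = 1;
	}
	return o->status;
}

/* Example init routine that counts how often it is invoked. */
int example_init_calls;

int
example_init(void)
{
	example_init_calls++;
	return 0;
}
```

With this shape, each entry point of a subsystem can call run_once() itself, so no explicit initialization call is needed in a central startup routine.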


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
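
As a rough illustration of the expansion step, the following user-space sketch replaces "@name" tokens in a link target. It is not the kernel code: the function `magic_expand`, the `magic_var` table, and the rule that a token must be followed by '/' or the end of the string are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name (without '@') and its value. */
struct magic_var {
	const char *name;
	const char *value;
};

/*
 * Copy src to dst, replacing each "@name" token with its value.
 * A token only matches when it is followed by '/' or the end of the
 * string (an assumption of this sketch), so "@machine" does not
 * swallow the front of "@machine_arch".
 * Returns 0 on success, -1 if the result does not fit in dstlen bytes.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL && vars != NULL);
	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const char *repl = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);
				if (strncmp(src + 1, vars[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					repl = vars[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (repl != NULL) {
			size_t n = strlen(repl);
			if (out + n >= dstlen)
				return -1;
			memcpy(dst + out, repl, n);
			out += n;
			src += skip;
		} else {
			/* Ordinary character, or '@' with no known name. */
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

Text that merely contains an '@' without a known magic name after it is copied through unchanged.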


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
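
A marker-skipping iteration macro of this kind might look like the sketch below. The `entry` structure, the `is_marker` field, and the `ENTRY_FOREACH` macro are hypothetical simplifications for illustration; they are not the kernel's struct proc or its PROCLIST_FOREACH definition.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; markers are placeholders left by an iterator. */
struct entry {
	int pid;
	int is_marker;
	struct entry *next;
};

/* Walk the list like a plain foreach, but skip marker entries. */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Example consumer: sums the pids of all real (non-marker) entries. */
int
sum_pids(struct entry *head)
{
	struct entry *e;
	int sum = 0;

	ENTRY_FOREACH(e, head) {
		assert(!e->is_marker);
		sum += e->pid;
	}
	return sum;
}
```

The trailing `else` lets the macro be followed by either a single statement or a braced block, just like an ordinary for loop.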


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
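
The sizing rule can be made concrete with a small calculation. This is a sketch under stated assumptions: the function name `default_vnode_count`, the per-mille parameter, and the idea of dividing the memory budget by the size of one vnode are illustrative, not taken from the committed code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in a given share of physical memory.
 * per_mille expresses the share in tenths of a percent, so 10 means 1%
 * and 5 means 0.5%. vnode_size is the assumed size of one vnode in bytes.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned int per_mille)
{
	assert(vnode_size > 0 && per_mille <= 1000);
	return physmem_bytes / 1000 * per_mille / vnode_size;
}
```

Because the result is divided by the vnode size, a platform with a larger struct vnode gets fewer cached vnodes from the same percentage, which is the concern raised for 64-bit ports.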


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
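
The "invoke every finalizer until none of them reports work" loop can be sketched as follows. The names `finalizer`, `run_finalizers`, and `countdown_finalizer` are hypothetical; the real hook's registration interface is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work. */
struct finalizer {
	int (*fn)(void *arg);
	void *arg;
};

/*
 * Call all finalizers repeatedly until one full pass completes in which
 * none of them reports work. Returns the number of passes made.
 */
unsigned int
run_finalizers(struct finalizer *f, size_t n)
{
	unsigned int passes = 0;
	int did_work;

	assert(f != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			if (f[i].fn(f[i].arg))
				did_work = 1;
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has work to do until its counter reaches zero. */
int
countdown_finalizer(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating the whole pass matters because work done by one finalizer (for example, configuring a RAID set) may expose new devices that another finalizer then has to handle.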


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
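
The mapping above, written as a translation function for illustration only. The enum and its numeric values are hypothetical stand-ins (the change removed the KERN_* codes outright rather than translating them at run time). KERN_FAILURE and the unused codes are left out because the table gives them no single errno value.

```c
#include <errno.h>

/* Hypothetical stand-in for the retired KERN_* codes. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the traditional errno value. */
static int
kern_to_errno(enum kern_code code)
{
	switch (code) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;	/* no mapping known */
}
```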


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
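
A sketch of what a minimum power-of-two alignment means for a candidate address. It does not show the uvm_map() signature; `is_power_of_two()` and `align_up()` are hypothetical helpers, and treating an alignment of 0 as "no constraint" is an assumption made here.

```c
#include <stdint.h>

/* Nonzero if x is a power of two (exactly one bit set). */
static int
is_power_of_two(uintptr_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align.  align must be a
 * power of two; 0 is treated here as "no alignment requested".
 */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	return (addr + align - 1) & ~(align - 1);
}
```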


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
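
A rough sketch of the sizing rule described above: spend 0.5% of memory on vnodes, but never go below a configured minimum. The function name, its parameters, and dividing the budget by a per-vnode size are assumptions for illustration; the commit message does not say how the byte budget is turned into a count.

```c
#include <stdint.h>

/*
 * mem_bytes:   total system memory in bytes
 * vnode_size:  assumed size of one vnode in bytes (must be nonzero)
 * min_vnodes:  floor on the result (the role NVNODE plays above)
 */
static uint64_t
auto_vnode_count(uint64_t mem_bytes, uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t budget = mem_bytes / 200;	/* 0.5% of memory */
	uint64_t count = budget / vnode_size;

	return count < min_vnodes ? min_vnodes : count;
}
```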


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
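
To illustrate the two properties listed above (caller-supplied storage, constant-time insert and remove), here is a simplified timing-wheel sketch. It is not the kernel's callout implementation: all names are hypothetical, delays must be between 1 and WHEEL_SIZE ticks, and a `struct callout` must be zero-initialized before first use.

```c
#include <stddef.h>

#define WHEEL_SIZE 8	/* hypothetical; must be a power of two */

/* The caller owns this storage; nothing is allocated on its behalf. */
struct callout {
	struct callout *next;	/* next entry in the same bucket */
	struct callout **prevp;	/* pointer to whatever points at us */
	void (*fn)(void *);
	void *arg;
};

struct wheel {
	struct callout *bucket[WHEEL_SIZE];
	unsigned now;		/* current tick */
};

/* O(1): push onto the head of the bucket for tick (now + ticks). */
static void
callout_insert(struct wheel *w, struct callout *c, unsigned ticks,
    void (*fn)(void *), void *arg)
{
	struct callout **head = &w->bucket[(w->now + ticks) & (WHEEL_SIZE - 1)];

	c->fn = fn;
	c->arg = arg;
	c->next = *head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = head;
	*head = c;
}

/* O(1): unlink without walking the bucket; no-op if not pending. */
static void
callout_remove(struct callout *c)
{
	if (c->prevp == NULL)
		return;
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* Advance one tick and fire everything due; returns how many fired. */
static int
wheel_tick(struct wheel *w)
{
	struct callout **head;
	int fired = 0;

	w->now++;
	head = &w->bucket[w->now & (WHEEL_SIZE - 1)];
	while (*head != NULL) {
		struct callout *c = *head;

		callout_remove(c);
		c->fn(c->arg);
		fired++;
	}
	return fired;
}

/* Example handler: increments the int that arg points to. */
static void
bump(void *arg)
{
	(*(int *)arg)++;
}
```

Keeping a back pointer (`prevp`) in each entry is what makes removal constant time; a singly linked bucket would need a search.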


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that the accounting file can't be put on an NFS file system (so,
e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme,
a core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.
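
A sketch of how a (low address, size) pair can be turned into an initial stack pointer depending on stack direction. The enum and `initial_sp()` are hypothetical; real machine-dependent code would also apply alignment and frame setup, which are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

enum stack_dir {
	STACK_GROWS_DOWN,	/* most architectures */
	STACK_GROWS_UP
};

/*
 * The caller describes the stack as its lowest address plus a size.
 * A downward-growing stack starts at the top of that region; an
 * upward-growing one starts at the bottom.
 */
static uintptr_t
initial_sp(uintptr_t low_addr, size_t size, enum stack_dir dir)
{
	return dir == STACK_GROWS_DOWN ? low_addr + size : low_addr;
}
```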

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
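
The rule above as a small sketch: the hard limit is the system-wide maximum and the soft limit is the smaller of the compile-time default and that maximum. `init_limit()` and its parameter names are hypothetical; `default_soft` plays the role of NOFILE or MAXUPRC, and `system_max` the role of maxfiles or maxproc.

```c
#include <sys/resource.h>

/* Build an initial rlimit whose soft limit never exceeds the hard limit. */
static struct rlimit
init_limit(rlim_t default_soft, rlim_t system_max)
{
	struct rlimit r;

	r.rlim_max = system_max;	/* not RLIM_INFINITY */
	r.rlim_cur = default_soft < system_max ? default_soft : system_max;
	return r;
}
```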


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
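
The same mapping shown as wrapper functions, mainly to make the swapped source/destination order visible. The `compat_*` names are hypothetical; the change itself rewrote call sites rather than adding wrappers.

```c
#include <string.h>

/* bcopy(src, dst, len): source first; regions must not overlap here. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): like bcopy but safe for overlapping regions. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): zero if equal, nonzero otherwise. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(p, len): fill with zero bytes. */
static void
compat_bzero(void *p, size_t len)
{
	memset(p, 0, len);
}
```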


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasing, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.536 26-Jan-2022 andvar

remove double t from targeted, add missing r to arbitrary
And fix few more typos along the way in comments and man pages.


Revision tags: thorpej-i2c-spi-conf2-base thorpej-futex2-base thorpej-cfargs2-base cjep_sun2x-base1 cjep_sun2x-base cjep_staticlib_x-base1 cjep_staticlib_x-base thorpej-i2c-spi-conf-base thorpej-cfargs-base thorpej-futex-base
# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


# 1.534 05-Dec-2020 thorpej

branches: 1.534.2;
Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
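
The thresholds in this message can be read as a small selection function. The following is only a sketch under stated assumptions: the function name, the use of a byte count, and the `fallback` parameter for machines under 1GB are hypothetical and are not taken from the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: choose a default open-file limit from the amount
 * of physical RAM. Only the two thresholds (16GB -> 20,000 and
 * 1GB -> 10,000) come from the commit message; the fallback for smaller
 * machines is supplied by the caller because the message does not state it.
 */
static int
default_maxfiles(uint64_t ram_bytes, int fallback)
{
	const uint64_t gib = UINT64_C(1) << 30;

	assert(fallback > 0);
	if (ram_bytes >= 16 * gib)
		return 20000;
	if (ram_bytes >= 1 * gib)
		return 10000;
	return fallback;
}
```

An explicit MAXFILES setting in the kernel config file would bypass a function like this entirely.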


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
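
The epoch usage rule above (cache the epoch when you seed, reseed if it has changed, start the cache at 0 to force a reseed on first use) can be sketched as a user-space model. Everything here is an assumption made for illustration: `model_entropy_epoch()` is a stand-in that reads a plain variable rather than the kernel's entropy_epoch(), and `struct seeded_thing` is a hypothetical consumer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's epoch counter; (unsigned)-1 models
 * "never reached full entropy". */
static unsigned model_epoch = (unsigned)-1;

static unsigned
model_entropy_epoch(void)
{
	return model_epoch;
}

/* Hypothetical consumer that keeps state seeded from the entropy pool. */
struct seeded_thing {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseed_count;	/* number of (re)seeds performed */
};

static void
seeded_thing_init(struct seeded_thing *st)
{
	assert(st != NULL);
	/* The epoch is never 0, so 0 forces a reseed on first use. */
	st->cached_epoch = 0;
	st->reseed_count = 0;
}

/* Call before each use; returns true if a reseed was performed. */
static bool
seeded_thing_prepare(struct seeded_thing *st)
{
	unsigned now = model_entropy_epoch();

	/* Compare for change rather than testing for -1. */
	if (now == st->cached_epoch)
		return false;
	/* A real consumer would draw fresh seed material here. */
	st->cached_epoch = now;
	st->reseed_count++;
	return true;
}
```

Checking for a change instead of for -1 means a later consolidation still triggers a reseed, which is the behaviour the message recommends.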

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
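
The lock/save/sleep/set/unlock pattern described in this message can be modelled in user space as follows. This is a sketch only: the `model_` names, the boolean standing in for the mutex, and the integer "clock" values are all hypothetical, the model's set function takes no arguments although the message elides the real argument list, and the save function asserts lock ownership even though the message notes that the real one does not yet do so.

```c
#include <assert.h>
#include <stdbool.h>

static bool todr_held;		/* models the TODR mutex */
static long todr_chip;		/* models the time-of-day hardware clock */
static long system_time;	/* models the kernel's idea of the time */

static void
model_todr_lock(void)
{
	assert(!todr_held);
	todr_held = true;
}

static void
model_todr_unlock(void)
{
	assert(todr_held);
	todr_held = false;
}

static bool
model_todr_lock_owned(void)
{
	return todr_held;
}

/* Save the system time into the TOD clock; caller holds the lock. */
static void
model_todr_save_systime(void)
{
	assert(model_todr_lock_owned());
	todr_chip = system_time;
}

/* Load the system time from the TOD clock; caller holds the lock. */
static void
model_todr_set_systime(void)
{
	assert(model_todr_lock_owned());
	system_time = todr_chip;
}

/* The suspend/resume pattern from the message, with the lock held
 * across the whole sequence. */
static void
model_suspend_resume(long slept)
{
	model_todr_lock();
	model_todr_save_systime();
	todr_chip += slept;	/* the hardware clock runs while asleep */
	model_todr_set_systime();
	model_todr_unlock();
}
```

Holding the lock across the whole sequence is what keeps another caller from touching the clock between the save and the later set.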


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
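
The message does not show the affected expression, so the following is only an illustration of the general failure mode, assuming a page count multiplied by a page size in 32-bit arithmetic. The function names and types are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Without a widening cast the multiplication is carried out in 32 bits
 * and wraps before the result is converted to 64 bits. */
static uint64_t
avail_bytes_narrow(uint32_t pages, uint32_t page_size)
{
	return pages * page_size;
}

/* Casting one operand first makes the multiplication 64-bit. */
static uint64_t
avail_bytes_wide(uint32_t pages, uint32_t page_size)
{
	return (uint64_t)pages * page_size;
}
```

With 2097152 pages of 4096 bytes (8GB), the narrow form wraps to 0 on a platform with 32-bit `int`, while the wide form gives the intended byte count.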


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector that enables to tell where a leak occurs
while a simpler version that is already committed just tells an occurrence of a
leak.

Investigating of psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
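
The boottime adjustment described in this revision amounts to subtracting the time elapsed since the clocks were started from the time of day at the moment it is initialised. A minimal sketch of that arithmetic, assuming a simplified `struct ts` in place of the kernel's timespec and hypothetical function names:

```c
#include <assert.h>

/* Simplified stand-in for struct timespec. */
struct ts {
	long long sec;
	long nsec;	/* kept in the range [0, 1000000000) */
};

/* Return a - b, borrowing a second if the nanoseconds go negative.
 * Assumes a is not earlier than b. */
static struct ts
ts_sub(struct ts a, struct ts b)
{
	struct ts r;

	assert(a.sec >= b.sec);
	r.sec = a.sec - b.sec;
	r.nsec = a.nsec - b.nsec;
	if (r.nsec < 0) {
		r.sec--;
		r.nsec += 1000000000L;
	}
	return r;
}

/* Boot time = time of day at initialisation minus the uptime
 * accumulated since the clocks were started. */
static struct ts
boottime_from(struct ts tod_at_init, struct ts uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

For example, a time of day of 1540000006.2 s taken 6.5 s after the clocks started gives a boot time of 1539999999.7 s.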


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
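
Since the message notes that the kernel's use of initpaths is undocumented, the following is only a guess at the simplest plausible use: walk the NULL-terminated list in order and stop at the first entry that works. The array contents are quoted from the message; `first_working_init()` and `only_rescue()` are hypothetical names introduced for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* List contents as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Hypothetical helper: return the first path for which try_exec()
 * reports success (0), or NULL if every candidate fails.
 */
static const char *
first_working_init(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);
	for (; *paths != NULL; paths++) {
		if (try_exec(*paths) == 0)
			return *paths;
	}
	return NULL;
}

/* Example predicate: pretend only the rescue copy is usable. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

Placing "/rescue/init" last makes it a final fallback, tried only after the three /sbin candidates have failed.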


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate the vnode cache size based on the resource it gets allocated from.
This stops kern.maxvnodes from being set so high that it exhausts available space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
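
The continuous-output test mentioned above can be sketched roughly as follows: each new block from the source is compared with the previous one, and a repeat is treated as a failure (a stuck generator). This is a minimal illustration under assumptions, not the rngtest code; struct cont_test, cont_test_check() and the 8-byte block size are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_BYTES 8	/* illustrative block size, not taken from the source */

struct cont_test {
	uint8_t prev[BLOCK_BYTES];	/* last block seen */
	bool have_prev;			/* false until the first block arrives */
};

/*
 * Return true if the block passes, i.e. it differs from the previous
 * block.  The first block has nothing to compare against, so it is
 * only recorded and reported as passing.
 */
static bool
cont_test_check(struct cont_test *ct, const uint8_t block[BLOCK_BYTES])
{
	bool ok;

	assert(ct != NULL && block != NULL);
	ok = !ct->have_prev || memcmp(ct->prev, block, BLOCK_BYTES) != 0;
	memcpy(ct->prev, block, BLOCK_BYTES);
	ct->have_prev = true;
	return ok;
}
```

A caller in the spirit of the entry above would detach the source on the first false return rather than halt the system.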

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
At the prompt for the init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objections. Changes summary by apb@.
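
A rough sketch of how input at the init path prompt might be classified, based only on the description above: the three special strings are taken from the entry, while the enum, the function name and the rule that a usable path starts with '/' are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

enum init_reply {
	REPLY_HALT,
	REPLY_REBOOT,
	REPLY_DDB,
	REPLY_PATH,
	REPLY_INVALID
};

/* Classify one line typed at the (hypothetical) init path prompt. */
static enum init_reply
classify_init_reply(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return REPLY_HALT;
	if (strcmp(s, "reboot") == 0)
		return REPLY_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return REPLY_DDB;
	if (s[0] == '/')	/* assumption: paths must be absolute */
		return REPLY_PATH;
	return REPLY_INVALID;	/* caller would ask again */
}
```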


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
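
The pattern described above (a lockless fast path, a global lock on the slow path, and no retry after an error) can be sketched in userland C as follows. This is not the kernel implementation: it assumes C11 atomics and a POSIX mutex in place of the kernel primitives, and struct once_sketch, run_once_sketch() and failing_init() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { ONCE_VIRGIN, ONCE_DONE };

struct once_sketch {
	atomic_int state;	/* ONCE_VIRGIN until the routine has run */
	int error;		/* result of the one and only run */
};

/* One global lock shared by every once_sketch object. */
static pthread_mutex_t once_mtx = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock taken once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&once_mtx);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		/* Record the result even on failure; there is no retry. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&once_mtx);
	return o->error;
}

/* Example routine that always fails, counting how often it is called. */
static int calls;

static int
failing_init(void)
{
	calls++;
	return 5;
}
```

Calling run_once_sketch() twice with failing_init() runs the routine a single time and hands the stored error back to both callers, which mirrors the "does not retry" behaviour in the entry.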


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
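
The limit described above works out as simple arithmetic. The sketch below assumes the cap is 75% of whichever resource is smaller, divided by the 2kB worst-case cost; max_vnodes() and desiredvnodes_ok() are hypothetical helper names, not the functions the commit touched.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VNODE_COST	2048u	/* worst-case bytes per vnode, per the entry */

/* Largest vnode count that stays within 75% of both KVA and RAM. */
static uint64_t
max_vnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit;

	assert(kva_bytes > 0 && phys_bytes > 0);
	limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;
	/* Divide before multiplying to avoid overflow on huge inputs. */
	return (limit / 4 * 3) / VNODE_COST;
}

/* Would a requested desiredvnodes value be accepted under this rule? */
static bool
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= max_vnodes(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the smaller resource is the KVA, giving 805306368 / 2048 = 393216 vnodes at most.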


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
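
The reference-counting rule in the list above (a lookup takes a reference,
release drops it, and detaching an in-use discipline fails with EBUSY) can
be modelled in a few lines. This is a hedged userspace sketch: struct
ldisc_model and the ldisc_model_* functions are hypothetical stand-ins, not
the ttyldisc_*() kernel interfaces, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a refcounted line discipline (not struct linesw). */
struct ldisc_model {
	const char *name;	/* discipline name, for illustration only */
	unsigned refcnt;	/* references currently held by lookups */
	int attached;		/* nonzero while registered with the system */
};

/* Look up a discipline; success takes a reference on it. */
static struct ldisc_model *
ldisc_model_lookup(struct ldisc_model *d)
{
	assert(d != NULL);
	if (!d->attached)
		return NULL;	/* detached disciplines cannot be found */
	d->refcnt++;
	return d;
}

/* Drop a reference obtained from ldisc_model_lookup(). */
static void
ldisc_model_release(struct ldisc_model *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Detach a discipline; refuse with EBUSY while references remain. */
static int
ldisc_model_detach(struct ldisc_model *d)
{
	assert(d != NULL);
	if (d->refcnt != 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

In this model a detach attempted between a lookup and its release returns
EBUSY, and succeeds once the reference has been dropped.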


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
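
As a rough illustration of the substitution described above, here is a
user-space sketch in C. It is not the kernel implementation: struct
magic_entry, expand_magic() and the table contents are hypothetical names
chosen for this example, and the real code may apply stricter matching rules
(for instance around path-component boundaries). The point it shows is that
the longest name has to win, since @machine is a prefix of @machine_arch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name and the value it expands to. */
struct magic_entry {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "powerpc" */
};

/*
 * Expand magic strings found in `target` into `out` (capacity `outlen`).
 * Returns 0 on success (out is NUL-terminated), or -1 if the result does
 * not fit; on failure the contents of `out` are unspecified.
 * At each '@' the longest matching name is used.
 */
static int
expand_magic(const char *target, const struct magic_entry *tab, size_t ntab,
    char *out, size_t outlen)
{
	size_t o = 0;

	if (outlen == 0)
		return -1;
	while (*target != '\0') {
		const struct magic_entry *best = NULL;
		size_t bestlen = 0;

		if (*target == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (n > bestlen &&
				    strncmp(target, tab[i].name, n) == 0) {
					best = &tab[i];
					bestlen = n;
				}
			}
		}
		if (best != NULL) {
			size_t vlen = strlen(best->value);

			/* Leave room for the terminating NUL. */
			if (o + vlen >= outlen)
				return -1;
			memcpy(out + o, best->value, vlen);
			o += vlen;
			target += bestlen;
		} else {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *target++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

With a table that maps @machine_arch to "powerpc", the link target
/arch/@machine_arch/bin from the example usage would expand to
/arch/powerpc/bin.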


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
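
The sizing rule discussed above (a fixed fraction of physical memory, with a
compile-time floor) can be sketched roughly as follows. This is only an
illustration: vnode_cache_size() and its parameters are hypothetical, the
per-vnode size is passed in rather than taken from sizeof(struct vnode), and
the actual kernel arithmetic may differ.

```c
#include <stdint.h>

/*
 * Sketch: how many vnodes fit in a given share of physical memory.
 * `tenths_of_percent` is the share in units of 0.1% (5 = 0.5%, 10 = 1%).
 * `min_vnodes` plays the role of the configured NVNODE floor.
 * All names and parameters here are illustrative assumptions.
 */
static uint64_t
vnode_cache_size(uint64_t physmem_bytes, uint64_t vnode_bytes,
    unsigned tenths_of_percent, uint64_t min_vnodes)
{
	uint64_t budget = physmem_bytes / 1000 * tenths_of_percent;
	uint64_t n = budget / vnode_bytes;

	return n < min_vnodes ? min_vnodes : n;
}
```

Because the vnode size is a divisor, a port with a larger struct vnode gets
fewer vnodes from the same percentage, which is the concern raised for the
64-bit ports.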


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
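
The "invoke iteratively until nobody has done any work" behaviour can be
sketched as a simple fixed-point loop. The types and names below
(finalizer_fn, struct finalizer, run_finalizers, countdown_finalizer) are
hypothetical and exist only for this example; they are not the kernel's
actual interface.

```c
#include <stddef.h>

/* Hypothetical finalizer signature: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer, and repeat the whole pass for as long as at least
 * one of them reported work. Returns the number of passes made.
 */
static unsigned
run_finalizers(const struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has `*arg` units of work, does one per call. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

Repeating the full pass matters because work done by one finalizer (such as
configuring a RAID set) may expose new devices that give another finalizer
something to do.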


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
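
For readers porting older code, the mapping above can be written down as a
small translation function. This is a sketch only: the OLD_KERN_* enum is a
stand-in defined for this example (its numeric values are arbitrary, not the
historical ones), and kern_to_errno() is a hypothetical helper, not something
the commit added.

```c
#include <errno.h>

/* Stand-ins for the retired codes; values here are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE,
	OLD_KERN_NOT_RECEIVER,
	OLD_KERN_NO_ACCESS,
	OLD_KERN_PAGES_LOCKED
};

/*
 * Translate an old code to the E* value listed in the mapping.
 * Returns -1 where the mapping names no single errno (KERN_FAILURE,
 * which mostly became assertions, and the three unused codes).
 */
static int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation
cannot be reversed.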


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
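
A minimal sketch of the limit initialization described above, assuming a
hypothetical helper init_limit() that is not part of the kernel source: the
hard limit takes the system-wide maximum (maxfiles or maxproc), and the soft
limit takes the smaller of the compiled-in default (NOFILE or MAXUPRC) and
that maximum.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper, for illustration only.
 * compiled_default stands in for NOFILE or MAXUPRC,
 * system_max stands in for maxfiles or maxproc.
 */
struct rlimit
init_limit(rlim_t compiled_default, rlim_t system_max)
{
	struct rlimit rl;

	/* Hard limit: the real system maximum, not RLIM_INFINITY. */
	rl.rlim_max = system_max;

	/* Soft limit: min(compiled_default, system_max). */
	rl.rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;

	return rl;
}
```

With these semantics the soft limit can never exceed the hard limit, which
is what callers of getrlimit(2) expect.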


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
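
The mapping above swaps the source and destination arguments for the copy
routines. As an illustration only, the hypothetical wrappers below (the
old_* names are assumptions, not kernel functions) keep the old BSD argument
order and forward to the ISO C routines exactly as listed.

```c
#include <string.h>

/* Old order: source first, destination second. */
void
old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);		/* bcopy(x, y, z) -> memcpy(y, x, z) */
}

/* Overlap-safe variant. */
void
old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);		/* ovbcopy(x, y, z) -> memmove(y, x, z) */
}

/* Argument order is unchanged for comparison. */
int
old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);	/* bcmp(x, y, z) -> memcmp(x, y, z) */
}

/* memset() gains an explicit fill value of 0. */
void
old_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);		/* bzero(x, y) -> memset(x, 0, y) */
}
```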


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.535 01-Apr-2021 simonb

Expose olde style intrcnt interrupt accounting via event counters.
This code will be garbage collected once our last legacy intrcnt
user is updated to native evcnts.


Revision tags: thorpej-cfargs-base thorpej-futex-base
# 1.534 05-Dec-2020 thorpej

Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
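
The thresholds above can be sketched as a simple tiered lookup. The function
name default_maxfiles() and the fallback parameter are assumptions made for
illustration; the log entry does not say what value applies below 1GB, so
the caller supplies it.

```c
#include <stdint.h>

#define GIB	((uint64_t)1 << 30)

/*
 * Hypothetical helper, for illustration only: pick a default open-file
 * limit from the amount of RAM, using the two thresholds in the log entry.
 */
unsigned
default_maxfiles(uint64_t ram_bytes, unsigned fallback)
{
	if (ram_bytes >= 16 * GIB)
		return 20000;		/* 16GB RAM or more */
	if (ram_bytes >= 1 * GIB)
		return 10000;		/* 1GB RAM or more */
	return fallback;		/* smaller machines: not stated in the log */
}
```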


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


# 1.531 08-Sep-2020 riastradh

branches: 1.531.2;
ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
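
The epoch-caching usage described above (cache the epoch when seeding,
reseed when it has changed) can be sketched roughly as follows. This is a
userland illustration, not kernel code: entropy_epoch_stub(), fake_epoch and
struct seeded_gen are hypothetical stand-ins, and the actual reseeding step
is reduced to a counter.

```c
/* Hypothetical stand-in for the kernel's epoch; -1 means never full entropy. */
unsigned fake_epoch = (unsigned)-1;

unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

struct seeded_gen {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseeds;	/* how many times we have (re)seeded */
};

void
gen_init(struct seeded_gen *g)
{
	/* The epoch is never zero, so 0 forces a reseed on first use. */
	g->cached_epoch = 0;
	g->reseeds = 0;
}

void
gen_use(struct seeded_gen *g)
{
	unsigned now = entropy_epoch_stub();

	/* Compare for change rather than testing for -1. */
	if (now != g->cached_epoch) {
		/* A real consumer would draw fresh seed material here. */
		g->cached_epoch = now;
		g->reseeds++;
	}
	/* ... produce output from the seeded state ... */
}
```

Checking for any change, rather than for -1 specifically, means a later
consolidation still triggers a reseed even if the earlier entropy estimate
was wrong.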

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
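
One way to build such circular sibling lists is sketched below. This is a
minimal illustration, not the kernel code: struct cpu, its fields, and both
function names are hypothetical, and only the "same core" ring is shown (a
"same package" ring would be built the same way from a package id).

```c
#include <stddef.h>

/* Hypothetical per-CPU record carrying only what the sketch needs. */
struct cpu {
	int core_id;			/* topology: which core this CPU is in */
	struct cpu *core_sibling;	/* next CPU in the same core (circular) */
};

/* Link every CPU into a ring with the other CPUs sharing its core. */
static void
link_core_siblings(struct cpu *cpus, size_t n)
{
	/* Start with each CPU as a ring of one. */
	for (size_t i = 0; i < n; i++)
		cpus[i].core_sibling = &cpus[i];

	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (cpus[j].core_id != cpus[i].core_id)
				continue;
			/* Splice CPU i into CPU j's ring, right after j. */
			cpus[i].core_sibling = cpus[j].core_sibling;
			cpus[j].core_sibling = &cpus[i];
			break;
		}
	}
}

/* Walk the ring once and count its members. */
static size_t
ring_length(const struct cpu *c)
{
	size_t len = 1;

	for (const struct cpu *p = c->core_sibling; p != c;
	    p = p->core_sibling)
		len++;
	return len;
}
```

A scheduler experiment can then start at any CPU and visit its siblings by
following core_sibling until it comes back to the starting point.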


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
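
The arithmetic described above can be sketched as follows. This is a userland
illustration only: boottime_from() and message_time() are hypothetical names,
and the real logic lives in the kernel and in dmesg(8).

```c
#include <time.h>

#define NSEC_PER_SEC	1000000000L

/* boottime = time of day now, minus the interval since the clocks started. */
static struct timespec
boottime_from(struct timespec now, struct timespec uptime)
{
	struct timespec bt;

	bt.tv_sec = now.tv_sec - uptime.tv_sec;
	bt.tv_nsec = now.tv_nsec - uptime.tv_nsec;
	if (bt.tv_nsec < 0) {		/* borrow one second */
		bt.tv_sec--;
		bt.tv_nsec += NSEC_PER_SEC;
	}
	return bt;
}

/* Time of day for a log message: boottime plus its uptime stamp, truncated. */
static time_t
message_time(struct timespec boottime, struct timespec stamp)
{
	time_t sec = boottime.tv_sec + stamp.tv_sec;

	if (boottime.tv_nsec + stamp.tv_nsec >= NSEC_PER_SEC)
		sec++;			/* carry from the nanosecond sum */
	return sec;			/* fraction dropped, not rounded */
}
```

Because the result is truncated and the stamp comes from the last processed
tick, a message logged just after a second boundary can appear one second
early, as noted above.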


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
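
The deferred hook establishment can be modelled as below. This is a
single-threaded userland sketch under stated assumptions: run_once_model() is
a hypothetical stand-in for the kernel's RUN_ONCE() (it is not thread-safe),
and establish_exithook() and sys_semget_model() are invented names.

```c
/* Hypothetical, single-threaded stand-in for the kernel's once control. */
struct once_model {
	int done;	/* non-zero once init has run */
	int error;	/* result of the one and only init call */
};

static int
run_once_model(struct once_model *o, int (*init)(void))
{
	if (!o->done) {
		o->error = init();
		o->done = 1;
	}
	return o->error;
}

static int hook_established;		/* counts hook set-ups */
static struct once_model sem_once;

/* Stands in for establishing the exit hook. */
static int
establish_exithook(void)
{
	hook_established++;
	return 0;
}

/* Every syscall entry point funnels through the same once control. */
static int
sys_semget_model(void)
{
	int error = run_once_model(&sem_once, establish_exithook);

	if (error != 0)
		return error;
	/* ... the real syscall work would go here ... */
	return 0;
}
```

If the module is unloaded before any syscall has run, hook_established is
still 0, which is why the unload path can no longer assert that the hook is
set.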


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are
loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
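
For illustration, a NULL-terminated path list like this one is usually walked
in order until one entry works. The sketch below assumes that behaviour; the
try_exec callback, first_usable_init(), and only_rescue() are hypothetical
and are not the code in start_init().

```c
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message above. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Return the first path for which try_exec() reports success (0). */
static const char *
first_usable_init(int (*try_exec)(const char *))
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;	/* nothing worked */
}

/* Example callback: pretend only the rescue copy exists. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

With the only_rescue() callback, first_usable_init() skips the three /sbin
entries and returns "/rescue/init".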


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from;
this stops kern.maxvnodes from being set so high that it exhausts available
space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
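
The effect of MODFLG_MUST_FORCE on a later load can be modelled as below.
This is a sketch under stated assumptions: the flag names come from the
commit message, but their values, struct module_model, both functions, and
the EPERM return are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>

#define MODFLG_MUST_FORCE	0x01	/* value assumed for illustration */
#define MODCTL_LOAD_FORCE	0x02	/* value assumed for illustration */

/* Hypothetical cut-down module record. */
struct module_model {
	unsigned int mod_flags;
	bool builtin;		/* entry is on the list of builtins */
};

/* Once set, the flag is never cleared again. */
static void
module_require_force_model(struct module_model *m)
{
	m->mod_flags |= MODFLG_MUST_FORCE;
}

/* Refuse to re-load a forced-off builtin unless the caller forces it. */
static int
module_load_model(const struct module_model *m, unsigned int load_flags)
{
	if (m->builtin && (m->mod_flags & MODFLG_MUST_FORCE) != 0 &&
	    (load_flags & MODCTL_LOAD_FORCE) == 0)
		return EPERM;	/* error code is an assumption */
	return 0;
}
```

A builtin that was explicitly unloaded therefore stays disabled for ordinary
loads and autoloads, and comes back only on an explicit forced load.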


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
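
The behaviour described above can be sketched in user space as follows.
This is only an illustration under stated assumptions, not the kernel
code: it uses C11 atomics, a spin lock stands in for the global lock,
and every name (once_t, run_once, failing_init, ...) is hypothetical.

```c
#include <stdatomic.h>

#define ONCE_INITIAL 0
#define ONCE_DONE    1

/* Hypothetical once control block; field names are illustrative only. */
typedef struct {
	atomic_int o_status;	/* ONCE_INITIAL or ONCE_DONE */
	int o_error;		/* value returned by the init routine */
} once_t;

#define ONCE_INITIALIZER { ONCE_INITIAL, 0 }

/* One global lock shared by all once_t objects (a spin lock in this sketch). */
static atomic_flag once_lock = ATOMIC_FLAG_INIT;

int
run_once(once_t *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has already run. */
	if (atomic_load_explicit(&o->o_status, memory_order_acquire) == ONCE_DONE)
		return o->o_error;

	while (atomic_flag_test_and_set_explicit(&once_lock, memory_order_acquire))
		continue;	/* spin until the global lock is free */

	if (atomic_load_explicit(&o->o_status, memory_order_relaxed) != ONCE_DONE) {
		o->o_error = fn();
		/* Mark done even on error, so the routine is never retried. */
		atomic_store_explicit(&o->o_status, ONCE_DONE, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_lock, memory_order_release);
	return o->o_error;
}

/* Example init routine that always fails; counts how often it is invoked. */
int init_calls;
int failing_init(void) { init_calls++; return 5; }

/* Example control block used to exercise run_once(). */
once_t example_once = ONCE_INITIALIZER;
```

Calling run_once(&example_once, failing_init) twice runs failing_init
only once; both calls return the stored error instead of retrying or
hanging.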


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
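
The arithmetic behind this limit can be sketched as below. This is a
minimal illustration of the rule as stated in the message, not the
committed code: the helper names and the way KVA and physical memory
sizes are passed in are assumptions.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode plus associated structures (2kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest vnode count whose worst-case footprint
 * stays within 75% of both KVA and physical memory.
 */
uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying to avoid overflow on large inputs. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Returns 0 if the requested value is acceptable, -1 if it is too large. */
int
check_desiredvnodes(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= vnode_limit(kva_bytes, phys_bytes) ? 0 : -1;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the smaller
budget is 768 MiB, which allows at most 393216 vnodes at 2kB each.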


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
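
The refcounting rule described above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be modelled in user space. This is a minimal Python sketch, not the kernel code: the class name and the EEXIST/ENOENT return values are assumptions made for illustration, and only the EBUSY behaviour comes from the commit message.

```python
import errno


class LineDisciplineRegistry:
    """Toy user-space model of a refcounted line-discipline list."""

    def __init__(self):
        self._refs = {}  # discipline name -> outstanding reference count

    def attach(self, name):
        """Register a discipline; EEXIST (assumed) if the name is taken."""
        if name in self._refs:
            return errno.EEXIST
        self._refs[name] = 0
        return 0

    def lookup(self, name):
        """Return True and take a reference if the discipline exists."""
        if name not in self._refs:
            return False
        self._refs[name] += 1
        return True

    def release(self, name):
        """Drop a reference previously taken by lookup()."""
        assert self._refs[name] > 0, "release without matching lookup"
        self._refs[name] -= 1

    def detach(self, name):
        """Unregister a discipline; EBUSY while references remain."""
        if name not in self._refs:
            return errno.ENOENT  # assumed error for an unknown name
        if self._refs[name] > 0:
            return errno.EBUSY
        del self._refs[name]
        return 0
```

A caller pairs every successful lookup() with a release(); detach() only succeeds once the count has dropped back to zero.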


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
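
The "once control" idea used here (and in revisions 1.256 and 1.255) is that a subsystem initializes itself on first use instead of being called explicitly from main(). The following is a user-space Python sketch of that pattern, not the kernel's implementation; the Once class and the lkm_open() function are hypothetical names chosen for the example.

```python
import threading


class Once:
    """Run an initializer at most once, even with concurrent callers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False

    def run(self, init):
        # Holding the lock across init() makes late callers wait until
        # initialization has finished before they proceed.
        with self._lock:
            if not self._done:
                init()
                self._done = True


calls = []          # records what happened, for demonstration only
lkm_once = Once()   # hypothetical once control for the subsystem


def lkm_open():
    """Hypothetical open path: initialize lazily, then do the real work."""
    lkm_once.run(lambda: calls.append("init"))
    calls.append("open")
```

Calling lkm_open() repeatedly runs the initializer only on the first call, which is why the explicit call from main() becomes unnecessary.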


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)
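
Collecting a fixed-size cipher block from a chain of arbitrarily sized fragments can be illustrated outside the kernel. This Python sketch copies bytes into a staging buffer, which is a simplification and an assumption: it does not reproduce how the net80211 code walks mbufs, and iter_blocks is a hypothetical helper name.

```python
def iter_blocks(fragments, block_size=16):
    """Yield fixed-size blocks from a chain of arbitrarily sized fragments."""
    pending = bytearray()
    for frag in fragments:
        pending.extend(frag)
        # Emit as many whole blocks as the data gathered so far allows,
        # no matter how many fragments it took to collect them.
        while len(pending) >= block_size:
            yield bytes(pending[:block_size])
            del pending[:block_size]
    if pending:
        # Trailing partial block: left to the caller to pad or reject.
        yield bytes(pending)
```

A block that straddles three or more fragments (including empty ones) is assembled the same way as one that sits inside a single fragment.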

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
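
The substitution itself can be sketched in user space. The Python below is an illustration only, not the kernel's lookup code: the function name, the sample values, and the rule that a magic string must be followed by "/" or the end of the target are assumptions made for the example.

```python
import re

# Matches "@name" only when followed by "/" or the end of the target
# (an assumed termination rule, chosen so "@machine" cannot match the
# prefix of a longer word).
_MAGIC = re.compile(r"@([a-z_]+)(?=/|$)")


def expand_magic(target, values):
    """Expand known magic strings in a symlink target; leave others alone."""
    def substitute(match):
        name = match.group(1)
        # Unknown names are kept verbatim, including the leading "@".
        return values.get(name, match.group(0))
    return _MAGIC.sub(substitute, target)


# Illustrative values; a real system would supply its own.
example_values = {
    "machine": "i386",
    "machine_arch": "i386",
    "ostype": "NetBSD",
}
```

With these sample values, the target of the /bin link from the example usage would resolve under /arch/i386.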


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
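
The marker technique can be modelled with an ordinary list. This Python sketch is a user-space illustration under assumptions: the Entry class and both function bodies are invented for the example and only mirror the behaviour described above (a walker keeps its place with a marker entry, and plain iteration skips markers).

```python
class Entry:
    """A list element: either a real process or a position marker."""

    def __init__(self, pid=None, marker=False):
        self.pid = pid
        self.marker = marker


def proclist_foreach(proclist):
    """Yield real entries only, skipping markers left by other walkers."""
    for entry in list(proclist):
        if not entry.marker:
            yield entry


def proclist_foreach_call(proclist, callback):
    """Call callback on every real entry, tolerating list changes meanwhile."""
    marker = Entry(marker=True)
    proclist.insert(0, marker)
    while True:
        pos = proclist.index(marker)
        # Find the next real entry after our marker.
        nxt = pos + 1
        while nxt < len(proclist) and proclist[nxt].marker:
            nxt += 1
        if nxt == len(proclist):
            break
        entry = proclist[nxt]
        # Move the marker just past the entry so our position survives
        # even if the callback removes that entry from the list.
        proclist.remove(marker)
        proclist.insert(proclist.index(entry) + 1, marker)
        callback(entry)
    proclist.remove(marker)
```

Because the walker resumes from its marker rather than from the entry it just visited, the callback is free to remove that entry.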


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
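
The sizing rule is simple arithmetic: take a fraction of physical memory and divide by the size of one vnode. The Python sketch below assumes a hypothetical function name and a made-up vnode size of 256 bytes; neither comes from the commit, and the kernel's actual calculation may differ.

```python
def default_vnode_cache_size(physmem_bytes, vnode_size, divisor=100):
    """Number of vnodes that fit in 1/divisor of physical memory.

    divisor=100 corresponds to 1% of memory, divisor=200 to 0.5%.
    """
    budget = physmem_bytes // divisor   # bytes set aside for vnodes
    return budget // vnode_size         # whole vnodes that fit


# Hypothetical machine: 512 MiB of RAM, 256-byte vnodes (assumed size).
PHYSMEM = 512 * 1024 * 1024
new_default = default_vnode_cache_size(PHYSMEM, 256, divisor=100)
old_default = default_vnode_cache_size(PHYSMEM, 256, divisor=200)
```

Since the count is inversely proportional to the vnode size, a port with a larger struct vnode gets fewer cached vnodes from the same 1%, which is the concern raised for 64-bit ports.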


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
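
The "iterate until nobody did any work" rule described above can be sketched as a small loop. This is only an illustration under stated assumptions: finalizer_fn, run_finalizers() and the two demo finalizers are hypothetical names, not the kernel's actual interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer type: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void);

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made (the last pass is the idle one).
 */
static int
run_finalizers(const finalizer_fn *fns, size_t nfns)
{
	int passes = 0;
	int did_work;

	assert(fns != NULL);
	do {
		did_work = 0;
		for (size_t i = 0; i < nfns; i++) {
			if (fns[i]() != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Demo finalizers: one has three units of deferred work, one never has any. */
static int demo_pending = 3;

static int
demo_countdown(void)
{
	if (demo_pending == 0)
		return 0;
	demo_pending--;
	return 1;
}

static int
demo_idle(void)
{
	return 0;
}
```

Looping until a whole pass is idle lets one finalizer's work (for example, a newly assembled RAID set) expose new work for another without fixing an ordering between them.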


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
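
The mapping table above can be read as a simple translation function. The sketch below is illustrative only: the OLD_KERN_* enumerators and old_kern_to_errno() are hypothetical stand-ins, their numeric values are not taken from the kernel headers, and the three unused codes are omitted.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code to the errno value listed in the table.
 * OLD_KERN_FAILURE has no single replacement ("various"), so this
 * sketch returns -1 for it and for anything unrecognized.
 */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```

The commit itself replaced the constants at each call site rather than adding a translation layer; the function form is used here only to make the table checkable.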


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
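
A minimal sketch of the sizing rule described above, assuming a hypothetical per-vnode memory cost: autosize_vnodes() and its parameters are illustrative names, and the real kernel computation may differ in units and rounding.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Size the vnode cache from a 0.5% share of memory, never going below
 * the compiled-in minimum. vnode_bytes is the assumed cost of one vnode.
 */
static size_t
autosize_vnodes(size_t mem_bytes, size_t vnode_bytes, size_t nvnode_min)
{
	size_t budget, count;

	assert(vnode_bytes != 0);
	budget = mem_bytes / 200;	/* 0.5% is 1/200 */
	count = budget / vnode_bytes;

	return count < nvnode_min ? nvnode_min : count;
}
```

With 20,000,000 bytes of memory, 100 bytes per vnode and a floor of 500, the budget is 100,000 bytes and the result is 1000 vnodes; with only 20,000 bytes of memory the floor of 500 applies.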


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
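
The two properties listed above (caller-supplied handle storage and constant-time insertion and removal) can be illustrated with an intrusive doubly linked list. This is a sketch under assumptions, not the kernel's callout implementation: struct callout_sketch and the bucket_* helpers are hypothetical, and a single list stands in for one bucket of whatever timing structure the kernel uses.

```c
#include <assert.h>
#include <stddef.h>

/* The client embeds or declares this handle; nothing is allocated here. */
struct callout_sketch {
	struct callout_sketch *next;
	struct callout_sketch *prev;
	void (*func)(void *);
	void *arg;
};

/* A bucket is a circular list whose head points to itself when empty. */
static void
bucket_init(struct callout_sketch *head)
{
	head->next = head->prev = head;
}

static int
bucket_empty(const struct callout_sketch *head)
{
	return head->next == head;
}

/* O(1): link the caller's handle in right after the head. */
static void
bucket_insert(struct callout_sketch *head, struct callout_sketch *c,
    void (*func)(void *), void *arg)
{
	assert(head != NULL && c != NULL);
	c->func = func;
	c->arg = arg;
	c->next = head->next;
	c->prev = head;
	head->next->prev = c;
	head->next = c;
}

/* O(1): unlink using only the handle's own pointers, no search needed. */
static void
bucket_remove(struct callout_sketch *c)
{
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->next = c->prev = c;
}
```

Because the handle carries its own links, removal never walks the list and insertion cannot fail for lack of memory.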


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and to relocate it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
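
To illustrate "a low address and a size; machine-dependent code accounts for stack direction", here is a minimal sketch. initial_stack_pointer() and the grows_up flag are hypothetical, and real ports also apply alignment and frame setup that are omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the starting stack pointer for a stack area given as
 * (low address, size). A downward-growing stack starts at the top
 * of the area; an upward-growing one starts at the low address.
 */
static uintptr_t
initial_stack_pointer(uintptr_t low_addr, size_t size, int grows_up)
{
	return grows_up ? low_addr : low_addr + size;
}
```

Describing the area by its low address keeps the caller's interface identical on both kinds of machine; only this one decision differs.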


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
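
The initialization rule described above can be sketched as follows. struct rlimit_sketch and init_limit() are hypothetical names used for illustration; the kernel uses struct rlimit with rlim_cur and rlim_max.

```c
/* Illustrative stand-in for struct rlimit. */
struct rlimit_sketch {
	unsigned long cur;	/* soft limit */
	unsigned long max;	/* hard limit */
};

/*
 * Hard limit: the system-wide maximum (maxfiles or maxproc),
 * not "infinity". Soft limit: the compiled-in default (NOFILE or
 * MAXUPRC), clamped so it never exceeds the hard limit.
 */
static struct rlimit_sketch
init_limit(unsigned long compiled_default, unsigned long system_max)
{
	struct rlimit_sketch r;

	r.max = system_max;
	r.cur = compiled_default < system_max ? compiled_default : system_max;
	return r;
}
```

A program that sizes a table from the hard limit then gets a finite, realistic number instead of RLIM_INFINITY.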


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
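
The key detail in the mapping above is that bcopy() and ovbcopy() take the source first while memcpy() and memmove() take the destination first. A minimal sketch with hypothetical compat_* shims (these wrappers are not part of the commit, which rewrote the call sites directly):

```c
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len): note the swapped argument order for memcpy(). */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy() allowed overlapping regions, so it maps to memmove(). */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp() only reports equal (zero) or different (nonzero). */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) becomes memset() with an explicit zero fill byte. */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Getting the argument order wrong in a mechanical conversion silently copies in the wrong direction, which is why the commit message spells the mapping out.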


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for a process to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.534 05-Dec-2020 thorpej

Refactor interval timers to make it possible to support types other than
the BSD/POSIX per-process timers:

- "struct ptimer" is split into "struct itimer" (common interval timer
data) and "struct ptimer" (per-process timer data, which contains a
"struct itimer").

- Introduce a new "struct itimer_ops" that supplies information about
the specific kind of interval timer, including its processing
queue, the softint handle used to schedule processing, the function
to call when the timer fires (which adds it to the queue), and an
optional function to call when the CLOCK_REALTIME clock is changed by
a call to clock_settime() or settimeofday().

- Rename some functions to clearly identify what they're operating on
(ptimer vs itimer).

- Use kmem(9) to allocate ptimer-related structures, rather than having
dedicated pools for them.

Welcome to NetBSD 9.99.77.


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.
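
The thresholds above can be sketched as a small selection function. This is
only an illustration: the helper name default_maxfiles() and the
small_default parameter are hypothetical, and the value used below the 1GB
threshold is left to the caller because the message does not state it.

```c
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)

/*
 * Hypothetical helper: pick a default file-table limit from the amount
 * of physical memory.  Only the 16GB -> 20,000 and 1GB -> 10,000 tiers
 * come from the commit message; small_default is an assumption.
 */
static int
default_maxfiles(uint64_t physmem_bytes, int small_default)
{
	if (physmem_bytes >= 16 * GIB)
		return 20000;
	if (physmem_bytes >= 1 * GIB)
		return 10000;
	return small_default;
}
```

A MAXFILES value given explicitly in the kernel config would bypass a
function like this entirely.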

TODO: Adjust this and other values totally dynamically.


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


Revision tags: thorpej-futex-base
# 1.531 08-Sep-2020 riastradh

ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
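
A user-space sketch of the idea, not the kernel code: every sketch_* name
below is hypothetical, and the per-CPU array merely stands in for whatever
counters the VM system really keeps. A caller that can tolerate a slightly
stale figure passes true and avoids touching the per-CPU data.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NCPU 4

static long sketch_percpu_free[SKETCH_NCPU];	/* stand-in per-CPU counters */
static long sketch_cached_total;		/* last full sum */
static unsigned sketch_full_sums;		/* how often we paid for a full sum */

/* The expensive path: visit every CPU's counter. */
static long
sketch_sum_all_cpus(void)
{
	long total = 0;

	for (size_t i = 0; i < SKETCH_NCPU; i++)
		total += sketch_percpu_free[i];
	sketch_full_sums++;
	return total;
}

/* cached == true: a recent value will do; false: fetch the latest total. */
static long
sketch_availmem(bool cached)
{
	if (!cached)
		sketch_cached_total = sketch_sum_all_cpus();
	return sketch_cached_total;
}
```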


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
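
The epoch usage rule described above (cache the epoch when seeding, reseed
if it has changed) might look roughly like this in a consumer. This is a
user-space sketch: sim_entropy_epoch() is a stand-in for the kernel's
entropy_epoch(), and struct seeded_thing and its helpers are hypothetical.

```c
#include <stdbool.h>

/* Stand-in for the kernel epoch counter: -1 until full entropy is reached. */
static unsigned sim_epoch = (unsigned)-1;

static unsigned
sim_entropy_epoch(void)
{
	return sim_epoch;
}

/* Hypothetical consumer that keeps some seeded state. */
struct seeded_thing {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseed_count;	/* for illustration only */
};

static void
seeded_init(struct seeded_thing *s)
{
	s->cached_epoch = 0;	/* epoch is never 0, so first use reseeds */
	s->reseed_count = 0;
}

/* Call before using the seeded state; returns true if a reseed happened. */
static bool
seeded_use(struct seeded_thing *s)
{
	unsigned now = sim_entropy_epoch();

	if (now != s->cached_epoch) {
		/* A real consumer would draw fresh seed material here. */
		s->cached_epoch = now;
		s->reseed_count++;
		return true;
	}
	return false;
}
```

Comparing for change rather than testing against -1 means a later
consolidation still triggers a reseed.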

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.
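
The locking pattern above can be modelled in user space as follows. This is
a single-threaded sketch under stated assumptions: a boolean stands in for
the mutex, two integers stand in for the clock chip and the system time, and
every sketch_* name and signature is hypothetical rather than the kernel
interface.

```c
#include <assert.h>
#include <stdbool.h>

static bool sketch_todr_held;	/* models the TODR mutex */
static long sketch_chip_time;	/* models the time-of-day chip */
static long sketch_system_time;	/* models the kernel's idea of time */

static void
sketch_todr_lock(void)
{
	assert(!sketch_todr_held);
	sketch_todr_held = true;
}

static void
sketch_todr_unlock(void)
{
	assert(sketch_todr_held);
	sketch_todr_held = false;
}

static bool
sketch_todr_lock_owned(void)
{
	return sketch_todr_held;
}

/* Write the system time to the chip (no lock assertion in this sketch). */
static void
sketch_todr_save_systime(void)
{
	sketch_chip_time = sketch_system_time;
}

/* Set the system time from the chip, falling back to base if it is unset. */
static void
sketch_todr_set_systime(long base)
{
	assert(sketch_todr_lock_owned());
	sketch_system_time = (sketch_chip_time != 0) ? sketch_chip_time : base;
}

/* Historically-named wrapper: take the lock around the real work. */
static void
sketch_inittodr(long base)
{
	sketch_todr_lock();
	sketch_todr_set_systime(base);
	sketch_todr_unlock();
}

/* The suspend/resume pattern: one lock hold spans save, sleep and restore. */
static void
sketch_suspend_resume(long seconds_asleep)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	sketch_chip_time += seconds_asleep;	/* the chip keeps ticking */
	sketch_todr_set_systime(0);
	sketch_todr_unlock();
}
```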

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

branches: 1.504.2;
Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
while the simpler version that is already committed just reports the
occurrence of a leak.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
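
The subtraction described above amounts to ordinary timespec arithmetic. A
minimal sketch, assuming the time of day and the uptime are both sampled at
the moment the time of day is initialised; the function names are
hypothetical:

```c
#include <time.h>

/* a - b for normalised timespecs, with a borrow from the seconds field. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time as "when the clocks started": the time of day at init minus
 * the uptime that had already elapsed by then.
 */
static struct timespec
boottime_at_clock_start(struct timespec tod_at_init,
    struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```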

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
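
The request/response flow that this entry describes for secmodel_eval(9) can be sketched in userland C. This is only an illustrative model, not the kernel implementation: model_register(), model_eval(), suser_eval(), the fixed-size table, and the choice of ENOENT for an absent secmodel are all assumptions made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland model; not the kernel's secmodel(9) code. */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct secmodel_entry {
	const char *id;		/* e.g. "org.netbsd.secmodel.suser" */
	eval_fn     eval;	/* callback that answers requests */
};

#define MAX_SECMODELS 8
static struct secmodel_entry registry[MAX_SECMODELS];
static size_t nregistered;

/* Register a secmodel and its evaluator; returns 0 or an errno value. */
static int
model_register(const char *id, eval_fn eval)
{
	assert(id != NULL && eval != NULL);
	if (nregistered == MAX_SECMODELS)
		return ENOSPC;
	registry[nregistered].id = id;
	registry[nregistered].eval = eval;
	nregistered++;
	return 0;
}

/* Forward a request to the named secmodel; ENOENT (assumed) if absent. */
static int
model_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < nregistered; i++) {
		if (strcmp(registry[i].id, id) == 0)
			return registry[i].eval(what, arg, ret);
	}
	return ENOENT;
}

/* Toy "suser" evaluator: arg points to a uid, ret points to a bool. */
static int
suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

A caller that only acts when the error is zero, as in the snippet from the commit message, copes with a missing secmodel without touching that secmodel's private state.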


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
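
The deferral idea behind config_mountroot(9) can be illustrated with a small userland sketch: a driver queues a callback during attach, and the queue is drained once the root file system is available. The names defer_until_mountroot(), run_mountroot_hooks() and load_firmware(), and the fixed-size queue, are hypothetical and are not the kernel's interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userland model of "run this after mountroot". */
typedef void (*deferred_fn)(void *dev);

struct deferred {
	deferred_fn fn;		/* callback to finish device setup */
	void       *dev;	/* argument handed back to the callback */
};

#define MAX_DEFERRED 16
static struct deferred queue[MAX_DEFERRED];
static size_t nqueued;

/* Queue a callback; returns 0 on success, -1 if the queue is full. */
static int
defer_until_mountroot(deferred_fn fn, void *dev)
{
	assert(fn != NULL);
	if (nqueued == MAX_DEFERRED)
		return -1;
	queue[nqueued].fn = fn;
	queue[nqueued].dev = dev;
	nqueued++;
	return 0;
}

/* Called once the root file system is mounted: drain in FIFO order. */
static void
run_mountroot_hooks(void)
{
	for (size_t i = 0; i < nqueued; i++)
		queue[i].fn(queue[i].dev);
	nqueued = 0;
}

/* Toy driver callback: pretend firmware was read from the root fs. */
static void
load_firmware(void *dev)
{
	*(int *)dev = 1;
}
```

Revision 1.462 later adds a wait for these deferred callbacks to finish before init is started; the sketch runs them synchronously, so it does not model that wait.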


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
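
As a rough userland analogue of the behaviour described above (one global
lock, an atomic flag for the lockless fast path, and no retry after an
error), consider the sketch below. It uses pthreads and C11 atomics rather
than kernel primitives, and every name in it (once_sketch, run_once_sketch,
failing_init) is hypothetical. Unlike a real implementation, it holds the
global lock while the routine runs, so the routine must not itself use a
once-control.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical once-control; field names are illustrative. */
struct once_sketch {
	atomic_int	done;	/* set to 1 after the routine has run */
	int		error;	/* result of the single run */
};

/* One global lock shared by every once-control (slow path only). */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Lockless fast path: already ran, return the recorded result. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; an error is recorded and never retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Example routine that always fails, to show the no-retry behaviour. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return EIO;
}
```

Calling run_once_sketch() twice with failing_init returns EIO both times
while init_calls stays at 1, which matches the "does not retry after an
error" semantics in the log message.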


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
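
The limit described above amounts to a small calculation, sketched here.
The function and parameter names are hypothetical, and taking 75% of the
smaller of the two resources is one reading of the log message rather than
the kernel's exact arithmetic.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST_SKETCH	2048u	/* worst-case bytes per vnode, per the log */

/*
 * Upper bound on desiredvnodes: the vnodes may consume at most 75% of
 * kernel virtual address space and at most 75% of physical memory.
 * Both sizes are in bytes.
 */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit;

	assert(kva_bytes > 0 && phys_bytes > 0);	/* sketch precondition */
	limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the smaller resource, divided by the per-vnode cost. */
	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}

/* Returns 0 if the requested value is acceptable, -1 otherwise. */
static int
check_desiredvnodes(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= max_desiredvnodes(kva_bytes, phys_bytes) ? 0 : -1;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, KVA is the binding
constraint: 75% of 1 GiB divided by 2 kB allows 393216 vnodes.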


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
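
The reference-counting rules in the list above can be illustrated with a
simplified sketch. The _sketch names are deliberately not the real
ttyldisc_*() signatures, the structure is reduced to what the example needs,
and all locking is omitted; treat it as an assumption-laden model of the
described behaviour, not the tty code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified line discipline record. */
struct ldisc_sketch {
	const char		*name;
	unsigned		 refcnt;	/* references held by lookups */
	struct ldisc_sketch	*next;		/* linked list, not an array */
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline by pushing it onto the list. */
static void
ldisc_attach_sketch(struct ldisc_sketch *ld)
{
	assert(ld != NULL && ld->refcnt == 0);
	ld->next = ldisc_list;
	ldisc_list = ld;
}

/* Look up by name; a successful lookup holds a reference. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	for (struct ldisc_sketch *ld = ldisc_list; ld != NULL; ld = ld->next) {
		if (strcmp(ld->name, name) == 0) {
			ld->refcnt++;
			return ld;
		}
	}
	return NULL;
}

/* Drop a reference taken by a lookup. */
static void
ldisc_release_sketch(struct ldisc_sketch *ld)
{
	assert(ld != NULL && ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuse with EBUSY while references are outstanding. */
static int
ldisc_detach_sketch(struct ldisc_sketch *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc_sketch **pp = &ldisc_list; *pp != NULL;
	    pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	}
	return ENOENT;	/* was never attached */
}
```

A detach attempted between a lookup and the matching release fails with
EBUSY; once the reference is released the same detach succeeds.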


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
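
As a rough userland illustration of the expansion described above (not the kernel's implementation), the substitution could look like this. The struct magic_var table and the expand_magic() name are hypothetical, and the rule that a name must run up to the next '/' or the end of the string is an assumption chosen so that @machine and @machine_arch do not collide.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; the kernel takes values from system state. */
struct magic_var {
	const char *name;	/* e.g. "machine_arch", without the '@' */
	const char *value;
};

/*
 * Copy a symlink target from in to out, replacing each "@name" that
 * matches an entry in vars. Unknown names are copied unchanged.
 * Returns 0 on success, -1 if out is too small.
 */
int
expand_magic(const char *in, char *out, size_t outlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t o = 0;

	while (*in != '\0') {
		const char *repl = NULL;
		size_t toklen = 0;

		if (*in == '@') {
			/* Candidate name: everything up to '/' or the end. */
			toklen = strcspn(in + 1, "/");
			for (size_t i = 0; i < nvars; i++) {
				if (strlen(vars[i].name) == toklen &&
				    strncmp(in + 1, vars[i].name, toklen) == 0) {
					repl = vars[i].value;
					break;
				}
			}
		}
		if (repl != NULL) {
			size_t rlen = strlen(repl);

			if (o + rlen >= outlen)
				return -1;
			memcpy(out + o, repl, rlen);
			o += rlen;
			in += 1 + toklen;	/* skip '@' and the name */
		} else {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;
		}
	}
	if (o >= outlen)
		return -1;
	out[o] = '\0';
	return 0;
}
```

With a table that maps machine_arch to, say, "x86_64", the link target /arch/@machine_arch/bin from the example usage would come out as /arch/x86_64/bin.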


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
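
A minimal sketch of a marker-skipping iteration macro, assuming a simple singly linked list and a hypothetical ENTRY_MARKER flag. It is not the kernel's PROCLIST_FOREACH, only the same idea in miniature:

```c
#include <stddef.h>

#define ENTRY_MARKER	0x1	/* hypothetical flag identifying a marker */

struct entry {
	struct entry *next;
	int flags;
	int pid;
};

/*
 * Iterate over a list but run the loop body only for real entries.
 * The if/else form lets the macro be followed by a single statement
 * or a braced block, just like a plain for loop.
 */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (((var)->flags & ENTRY_MARKER) != 0)			\
			continue;					\
		else

/* Count the entries that are not markers. */
int
count_real_entries(struct entry *head)
{
	struct entry *e;
	int n = 0;

	ENTRY_FOREACH(e, head)
		n++;
	return n;
}
```

An iterator that may sleep in the middle of a walk can leave a marker in the list to remember its position, and other walkers using a macro like this never see it.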


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
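
The "iterate until nobody did any work" rule above can be sketched like this. The finalizer signature (nonzero return means work was done) and all names here are assumptions for illustration, not the kernel's interface.

```c
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer, and keep making full passes until one pass
 * completes in which none of them reports work. Returns the number
 * of passes made, including the final quiet one.
 */
int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: pretends to configure one more unit per pass. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

Repeating whole passes matters because work done by one finalizer (for example, a newly configured RAID set) may give another finalizer something new to do.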


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
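
For code that still had to translate old return values, the table above amounts to a switch like the following sketch. The OLD_KERN_* enum is an illustrative stand-in (its numeric values are assumptions), and KERN_FAILURE is left out on purpose because the log gives no single replacement for it.

```c
#include <errno.h>

/* Illustrative stand-ins for the retired codes; values are assumed. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Map an old VM return code to the traditional errno value. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return EINVAL;	/* not reached for the values listed above */
}
```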


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
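
The sizing rule can be illustrated with a little arithmetic. This is a sketch under stated assumptions: the function name, the byte-based units, and the per-vnode size passed in are all hypothetical, since the log does not say how the kernel actually computes the figure.

```c
#include <stdint.h>

/*
 * Return a vnode cache size that fills 1/divisor of physical memory,
 * but never fewer than min_vnodes (the role NVNODE plays above).
 * A divisor of 200 corresponds to 0.5% of memory.
 */
uint64_t
auto_size_vnodes(uint64_t physmem_bytes, uint64_t vnode_bytes,
    uint64_t divisor, uint64_t min_vnodes)
{
	uint64_t n;

	if (vnode_bytes == 0 || divisor == 0)
		return min_vnodes;
	n = (physmem_bytes / divisor) / vnode_bytes;
	return n < min_vnodes ? min_vnodes : n;
}
```

For example, with 1 GiB of memory and an assumed 256-byte vnode, 0.5% of memory is about 5.1 MiB, which is room for 20971 vnodes; on a 16 MiB machine the same rule would give only 327, so the minimum applies instead.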


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some port like alpha and mips calls allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme, a
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
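
The "low address and a size" convention can be illustrated with a minimal sketch. The helper name and the `grows_down` flag are hypothetical, not the kernel's interface, and real machine-dependent code would also apply alignment, which is omitted here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given a stack area described by its lowest
 * address and its size, return the initial stack pointer. On a
 * downward-growing stack that is the top of the area; on an
 * upward-growing stack it is the base.
 */
static uintptr_t
initial_stack_pointer(void *stack_low, size_t size, bool grows_down)
{
	assert(stack_low != NULL && size > 0);
	uintptr_t base = (uintptr_t)stack_low;
	return grows_down ? base + size : base;
}
```

The caller describes the area the same way on every platform; only this one decision depends on the stack direction.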


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
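
A minimal sketch of the initialization rule described above, assuming a hypothetical helper `init_limit()`; the kernel's actual code and the values of NOFILE, MAXUPRC, maxfiles and maxproc are not reproduced here:

```c
#include <assert.h>
#include <sys/resource.h>

static rlim_t
rlim_min(rlim_t a, rlim_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: the hard limit is the system-wide maximum
 * (not RLIM_INFINITY), and the soft limit is the compile-time
 * default clamped to that maximum.
 */
static void
init_limit(struct rlimit *rl, rlim_t soft_default, rlim_t system_max)
{
	rl->rlim_max = system_max;
	rl->rlim_cur = rlim_min(soft_default, system_max);
	assert(rl->rlim_cur <= rl->rlim_max);
}
```

With this shape, a getrlimit() caller always sees a finite hard limit it can size its tables from.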


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
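
The main hazard in this conversion is argument order: bcopy() and ovbcopy() take (src, dst, len), while memcpy() and memmove() take (dst, src, len). A small sketch with hypothetical wrapper names (not part of the kernel) makes the swap explicit:

```c
#include <assert.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len); regions must not overlap. */
static void
copy_like_bcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len); overlap is allowed. */
static void
copy_like_ovbcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memmove(dst, src, len);
}

/* bzero(buf, len) -> memset(buf, 0, len) */
static void
zero_like_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

bcmp() and memcmp() take their arguments in the same order, so that mapping needs no swap.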


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for a process to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.533 12-Nov-2020 simonb

Set a better default for MAXFILES on larger RAM machines if not
otherwise specified in the kernel config file. Arbitrary numbers are
20,000 files for 16GB RAM or more and 10,000 files for 1GB RAM or
more.

TODO: Adjust this and other values totally dynamically.
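
The tiering described above can be sketched as a small function. The function name and the `base_default` parameter are hypothetical; only the 16GB/20,000 and 1GB/10,000 thresholds come from the commit message:

```c
#include <assert.h>
#include <stdint.h>

#define GiB (UINT64_C(1) << 30)

/*
 * Hypothetical helper: pick a default file-table limit from the
 * amount of physical RAM. base_default stands in for whatever the
 * kernel would use on smaller machines (an assumption, not shown
 * in the log).
 */
static int
default_maxfiles(uint64_t ram_bytes, int base_default)
{
	assert(base_default > 0);
	if (ram_bytes >= 16 * GiB)
		return 20000;
	if (ram_bytes >= 1 * GiB)
		return 10000;
	return base_default;
}
```

An explicit MAXFILES setting in the kernel config file would bypass a function like this entirely.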


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


Revision tags: thorpej-futex-base
# 1.531 08-Sep-2020 riastradh

ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
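
The epoch-caching usage described above (cache the epoch when seeding, reseed when it changes, initialize the cache to 0 to force a reseed on first use) can be sketched in userland as follows. `entropy_epoch_mock()` is a stand-in for the kernel's entropy_epoch(), and its unsigned return type and the `seeded_state` structure are assumptions made for this illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* Mock epoch source; (unsigned)-1 models "never reached full entropy". */
static unsigned mock_epoch = (unsigned)-1;

static unsigned
entropy_epoch_mock(void)
{
	return mock_epoch;
}

struct seeded_state {
	unsigned cached_epoch;	/* 0 = never seeded; a real epoch is never 0 */
	unsigned reseed_count;	/* how many times we (re)seeded */
};

/* Reseed if the epoch changed since we last seeded; report whether we did. */
static bool
reseed_if_needed(struct seeded_state *st)
{
	unsigned now = entropy_epoch_mock();

	assert(now != 0);	/* the epoch is documented as never zero */
	if (st->cached_epoch == now)
		return false;
	st->cached_epoch = now;	/* real code would draw fresh seed material here */
	st->reseed_count++;
	return true;
}
```

Comparing for change rather than testing for -1 means a later consolidation still triggers a reseed even if the system had already claimed full entropy.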

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
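
The suspend/resume pattern above can be sketched in userspace roughly as
follows. This is only a model under stated assumptions: the lock is a C11
atomic_flag rather than a kernel mutex, the two clocks are plain variables,
and the helpers return a bool to report a missing lock, which the kernel
functions of the same names do not do. suspend_resume() is a hypothetical
driver for the sequence.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-ins; not the kernel implementation. */
static atomic_flag todr_flag = ATOMIC_FLAG_INIT;
static bool todr_held;		/* written only by the lock holder */
static time_t todr_chip;	/* models the battery-backed clock */
static time_t systime;		/* models the running system clock */

static void
todr_lock(void)
{
	while (atomic_flag_test_and_set(&todr_flag))
		;	/* spin until the flag is free */
	todr_held = true;
}

static void
todr_unlock(void)
{
	todr_held = false;
	atomic_flag_clear(&todr_flag);
}

static bool
todr_lock_owned(void)
{
	return todr_held;
}

/* Save system time to the clock chip; refuses to run without the lock. */
static bool
todr_save_systime(void)
{
	if (!todr_lock_owned())
		return false;
	todr_chip = systime;
	return true;
}

/* Load system time from the clock chip, falling back to base if unset. */
static bool
todr_set_systime(time_t base)
{
	if (!todr_lock_owned())
		return false;
	systime = (todr_chip != 0) ? todr_chip : base;
	return true;
}

/* The lock is held across the whole save / sleep / restore sequence. */
static time_t
suspend_resume(time_t now, time_t slept)
{
	systime = now;
	todr_lock();
	todr_save_systime();
	todr_chip += slept;	/* the chip keeps ticking while asleep */
	todr_set_systime(0);
	todr_unlock();
	return systime;
}
```

Holding one lock across the whole sequence keeps any other clock update from
landing between the save and the restore.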


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
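
A small sketch of how a lost widening cast can produce this kind of
overflow, assuming the size is computed as a page count shifted by the page
size. The function names and the 4 KiB page shift are hypothetical, and the
sketch uses unsigned arithmetic so the wraparound is well defined in C.

```c
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12	/* assumed 4 KiB pages */

/* Shift is done in 32 bits, so the result wraps before it is widened. */
static uint64_t
bytes_narrow(uint32_t pages)
{
	return (uint32_t)(pages << MODEL_PAGE_SHIFT);
}

/* Widen first, then shift: the full byte count is preserved. */
static uint64_t
bytes_wide(uint32_t pages)
{
	return (uint64_t)pages << MODEL_PAGE_SHIFT;
}
```

With 2^21 pages (8 GiB), the narrow version wraps to zero while the widened
one yields the full byte count.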


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

This is another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
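
The subtraction described above can be sketched as follows. boottime_from()
is a hypothetical helper written for this note, not a kernel function; it
takes the wall-clock time at the moment the time of day is initialised and
the interval since the clocks were started.

```c
#include <time.h>

/*
 * Hypothetical helper: boot time is the wall-clock time now minus the
 * time elapsed since the clocks were started.
 */
static struct timespec
boottime_from(struct timespec now, struct timespec uptime)
{
	struct timespec bt;

	bt.tv_sec = now.tv_sec - uptime.tv_sec;
	bt.tv_nsec = now.tv_nsec - uptime.tv_nsec;
	if (bt.tv_nsec < 0) {
		/* Borrow one second when the nanoseconds go negative. */
		bt.tv_sec--;
		bt.tv_nsec += 1000000000L;
	}
	return bt;
}
```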

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
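
A minimal sketch of how such a list can be consumed, assuming the entries
are tried in order until one can be started. The list contents are from the
commit message; first_usable_init(), the try_exec_fn callback, and the two
demo predicates are hypothetical and stand in for the real exec attempt.

```c
#include <stddef.h>
#include <string.h>

/* List as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical callback: returns 0 if the path could be started. */
typedef int (*try_exec_fn)(const char *path);

/* Walk the list in order and return the first path that works. */
static const char *
first_usable_init(try_exec_fn try_exec)
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;	/* nothing could be started */
}

/* Demo predicate: models a system where only the rescue copy is intact. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Demo predicate: models a system with no usable init at all. */
static int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

Putting the rescue binary last means it is only reached once every regular
candidate has failed.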


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
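
The reference-counting rule described above (a lookup holds a reference,
ttyldisc_release() drops it, and detaching an in-use discipline fails with
EBUSY) can be sketched in userland C. This is only an illustration under
stated assumptions: struct ldisc and the ldisc_*() functions are hypothetical
stand-ins, not the kernel's struct linesw or ttyldisc_*() code, and the
sketch does no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified record; not the kernel's struct linesw. */
struct ldisc {
	const char *name;
	unsigned int refcnt;	/* references handed out by ldisc_lookup() */
	struct ldisc *next;
};

static struct ldisc *ldisc_head;	/* linked list of registered disciplines */

/* Register a discipline; fails with EEXIST if the name is already taken. */
int
ldisc_attach(struct ldisc *d)
{
	assert(d != NULL && d->name != NULL);
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_head;
	ldisc_head = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
int
ldisc_detach(struct ldisc *d)
{
	assert(d != NULL);
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			return 0;
		}
	return ENOENT;
}
```

A provider would call ldisc_attach() once at registration time; a consumer
pairs every successful ldisc_lookup() with an ldisc_release().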


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
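
The "once control" pattern used here and in the neighbouring commits can be
sketched as follows. This is a single-threaded illustration only: struct
once_ctl, run_once(), subsys_init() and subsys_open() are hypothetical names,
and the kernel's real once mechanism additionally has to cope with
concurrent callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical once-control record: remembers whether init ran and its result. */
struct once_ctl {
	bool done;
	int error;
};

/* Run init() the first time only; later calls return the saved result. */
int
run_once(struct once_ctl *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->error = init();
		o->done = true;
	}
	return o->error;
}

/* Example subsystem: initialized lazily on first open, not from main(). */
static int subsys_init_calls;
static struct once_ctl subsys_once;

static int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

int
subsys_open(void)
{
	return run_once(&subsys_once, subsys_init);
}
```

The point of the design is that main() no longer needs to know about the
subsystem at all; the first consumer pays the initialization cost.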


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.
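
Collecting one 16-byte cipher block from a chain that may be split into any
number of fragments can be sketched like this. It is a userland illustration
under stated assumptions: struct frag and chain_gather() are hypothetical and
stand in for mbufs and the net80211 code, which this does not reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one buffer in a chain (not struct mbuf). */
struct frag {
	const unsigned char *data;
	size_t len;
	const struct frag *next;
};

/*
 * Copy up to `want` bytes, starting `off` bytes into the chain, into `out`.
 * Walks as many fragments as needed and returns the number of bytes copied,
 * which is less than `want` only when the chain runs out.
 */
size_t
chain_gather(const struct frag *f, size_t off, unsigned char *out, size_t want)
{
	size_t got = 0;

	assert(out != NULL);
	while (f != NULL && got < want) {
		if (off >= f->len) {
			/* Starting offset lies beyond this fragment: skip it. */
			off -= f->len;
			f = f->next;
			continue;
		}
		size_t n = f->len - off;
		if (n > want - got)
			n = want - got;
		memcpy(out + got, f->data + off, n);
		got += n;
		off = 0;
		f = f->next;
	}
	return got;
}
```

A caller would treat a short return as a malformed chain rather than
encrypting a partial block.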

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
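
The substitution itself can be sketched in userland C. This is an
illustration only, not the kernel's implementation: magic_vars, its values,
and magic_expand() are hypothetical, and the table is ordered so that
"machine_arch" is tried before its prefix "machine".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical substitution table; the values are made up for illustration. */
struct magic_var {
	const char *name;	/* text following '@' */
	const char *value;	/* replacement text */
};

static const struct magic_var magic_vars[] = {
	{ "machine_arch", "x86_64" },	/* before "machine": longest match first */
	{ "machine",      "amd64" },
	{ "ostype",       "NetBSD" },
};

/*
 * Expand every known "@name" in src into dst (capacity dstlen bytes).
 * Unknown '@' sequences are copied unchanged.
 * Returns 0 on success, -1 if the result does not fit.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		const char *piece = src;	/* default: copy one character */
		size_t piecelen = 1, consumed = 1;

		if (*src == '@') {
			size_t nvars = sizeof(magic_vars) / sizeof(magic_vars[0]);
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(magic_vars[i].name);
				if (strncmp(src + 1, magic_vars[i].name, n) == 0) {
					piece = magic_vars[i].value;
					piecelen = strlen(piece);
					consumed = n + 1;
					break;
				}
			}
		}
		if (out + piecelen >= dstlen)
			return -1;	/* no room for the piece plus the NUL */
		memcpy(dst + out, piece, piecelen);
		out += piecelen;
		src += consumed;
	}
	if (out >= dstlen)
		return -1;
	dst[out] = '\0';
	return 0;
}
```

With the table above, the link target /arch/@machine_arch/bin would resolve
to /arch/x86_64/bin.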


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
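
A marker-skipping iteration macro of this kind might look like the sketch
below. It is an assumption-laden illustration, not the real PROCLIST_FOREACH:
struct node, its is_marker flag, and NODE_FOREACH are hypothetical, and a
plain singly linked list replaces the kernel's proclist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list entry; markers are placeholders left by an iterator. */
struct node {
	struct node *next;
	bool is_marker;
	int pid;
};

/*
 * Iterate over every entry but run the loop body only for non-marker
 * entries. The trailing "else" makes the statement that follows the macro
 * the loop body.
 */
#define NODE_FOREACH(var, head)						\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Count the real (non-marker) entries on the list. */
int
count_real(struct node *head)
{
	struct node *n;
	int count = 0;

	NODE_FOREACH(n, head) {
		assert(!n->is_marker);
		count++;
	}
	return count;
}
```

Callers that use the macro never see marker entries, so code that inserts
markers to remember its position does not disturb other walkers.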


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
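
The sizing rule amounts to simple arithmetic, sketched below. The function
name, the 200-byte vnode size used in the usage note, and the idea of passing
an explicit lower bound are assumptions for illustration; the actual
computation in init_main.c is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in 1% of physical memory, but never fewer
 * than `floor` (a previously configured minimum).
 */
uint64_t
vnode_cache_target(uint64_t physmem_bytes, uint64_t vnode_size, uint64_t floor)
{
	assert(vnode_size > 0);
	uint64_t n = (physmem_bytes / 100) / vnode_size;
	return n > floor ? n : floor;
}
```

For example, with 512 MiB of memory and an assumed 200-byte vnode this
yields 26843 vnodes; a larger struct vnode, as on 64-bit ports, gives
proportionally fewer entries for the same 1%.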


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
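
The "iterate until nobody did any work" rule above can be sketched as a small fixed-point loop. This is only an illustration: `struct finalizer`, `run_finalizers()` and `drain_one()` are hypothetical names, not the kernel's actual interface, and a finalizer that always reports work would make the loop spin forever.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

/* Hypothetical registration record; not the kernel's structure. */
struct finalizer {
    finalizer_fn fn;
    void *arg;
};

/*
 * Invoke every finalizer repeatedly until one complete pass reports
 * no work from any of them. Returns the number of passes made.
 */
int run_finalizers(struct finalizer *list, size_t count)
{
    int passes = 0;
    int did_work;

    do {
        did_work = 0;
        for (size_t i = 0; i < count; i++) {
            if (list[i].fn(list[i].arg))
                did_work = 1;
        }
        passes++;
    } while (did_work);

    return passes;
}

/* Example finalizer: consumes one unit of pending work per call. */
int drain_one(void *arg)
{
    int *pending = arg;

    if (*pending > 0) {
        (*pending)--;
        return 1;
    }
    return 0;
}
```

The final pass is always a "quiet" one, so a finalizer with three units of pending work needs four passes in this sketch.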


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
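
The table above can be read as a simple translation function. The sketch below is illustrative only: the `OLD_KERN_*` enumerators are hypothetical stand-ins with arbitrary values (not the historical constants), and codes without a single fixed replacement (KERN_FAILURE and the unused ones) are reported as -1 rather than guessed.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
    OLD_KERN_SUCCESS,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_FAILURE,
    OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old-style code to the errno value listed in the
 * mapping. Returns -1 when the mapping gives no single errno.
 */
int old_kern_to_errno(enum old_kern_code code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:
        return 0;
    case OLD_KERN_INVALID_ADDRESS:
        return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE:
        return EACCES;
    case OLD_KERN_NO_SPACE:
    case OLD_KERN_RESOURCE_SHORTAGE:
        return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:
        return EINVAL;
    default:
        /* e.g. KERN_FAILURE: "various", no one-to-one replacement */
        return -1;
    }
}
```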


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
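
A minimal sketch of the kind of overflow this avoids, assuming the quantity is computed as a page count times a page size. The function names are hypothetical, and `uint64_t` stands in for a wide type such as rlim_t.

```c
#include <stdint.h>

/*
 * Narrow version: the product is computed in 32 bits and wraps
 * once the result reaches 4 GiB.
 */
uint32_t avail_bytes_narrow(uint32_t free_pages, uint32_t page_size)
{
    return free_pages * page_size;
}

/*
 * Wide version: widen one operand before multiplying so the whole
 * product is carried in 64 bits.
 */
uint64_t avail_bytes_wide(uint32_t free_pages, uint32_t page_size)
{
    return (uint64_t)free_pages * page_size;
}
```

With 1048576 pages of 4096 bytes, the narrow version wraps to 0 while the wide version yields 4294967296.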


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
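
As a rough illustration of the sizing rule above, assuming the budget is simply 0.5% of memory divided by a per-vnode cost. The function name and parameters are hypothetical, and the per-vnode size used in any call is an assumption, not the kernel's real figure.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit in 0.5% of memory, but never fewer than
 * min_vnodes (the configured NVNODE floor). vnode_size must be
 * nonzero.
 */
uint64_t auto_vnode_count(uint64_t mem_bytes, uint64_t vnode_size,
    uint64_t min_vnodes)
{
    uint64_t budget = mem_bytes / 200;    /* 0.5% of memory */
    uint64_t count = budget / vnode_size;

    return count > min_vnodes ? count : min_vnodes;
}
```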


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
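
A minimal sketch of what "accounts for stack direction" can mean for a stack given as a low address plus a size. The function name is hypothetical; real machine-dependent code would also apply alignment and set up an initial frame, which is omitted here.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the starting stack pointer for the region [base, base + size).
 * On a grows-down machine the stack starts at the high end; on a
 * grows-up machine it starts at the low end.
 */
uintptr_t initial_stack_pointer(uintptr_t base, size_t size, int grows_up)
{
    return grows_up ? base : base + size;
}
```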


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
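
The two rules above amount to clamping the soft limit to the hard limit. A small sketch, using a hypothetical `struct limit_pair` in place of the kernel's struct rlimit:

```c
#include <stdint.h>

/* Hypothetical stand-in for a soft/hard resource limit pair. */
struct limit_pair {
    uint64_t cur;    /* soft limit */
    uint64_t max;    /* hard limit */
};

/*
 * Build the initial limit: the hard limit is the system-wide maximum
 * (e.g. maxfiles or maxproc) and the soft limit is the compiled-in
 * default (e.g. NOFILE or MAXUPRC), capped at that maximum.
 */
struct limit_pair init_limit(uint64_t compiled_default, uint64_t system_max)
{
    struct limit_pair lim;

    lim.max = system_max;
    lim.cur = compiled_default < system_max ? compiled_default : system_max;
    return lim;
}
```

The cap matters when the system-wide maximum is configured below the compiled-in default; otherwise the soft limit would exceed the hard limit.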


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
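
The part of this mapping that is easy to get wrong is the argument order: bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. The wrappers below are hypothetical helpers that keep the old (source, destination) order only to make the swap visible; they use nothing beyond the standard <string.h> calls.

```c
#include <string.h>

/* bcopy(src, dst, n) becomes memcpy(dst, src, n): arguments swap. */
void copy_bytes(const void *src, void *dst, size_t n)
{
    memcpy(dst, src, n);
}

/* ovbcopy(src, dst, n) becomes memmove(dst, src, n): overlap-safe. */
void copy_overlapping(const void *src, void *dst, size_t n)
{
    memmove(dst, src, n);
}

/* bcmp(a, b, n) becomes memcmp(a, b, n); nonzero means "differ". */
int bytes_differ(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) != 0;
}

/* bzero(buf, n) becomes memset(buf, 0, n): a fill value is added. */
void zero_bytes(void *buf, size_t n)
{
    memset(buf, 0, n);
}
```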


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.532 04-Nov-2020 chs

In uvmpd_tryownerlock(), if the initial try-lock of the owner lock fails
then rather than do more try-locks and eventually sleep for a tick,
take a hold on the current owner's lock, drop the page interlock,
and acquire the lock that we took the hold on in a blocking fashion.
After we get the lock, check if the lock that we acquired is still
the lock for the owner of the page that we're interested in.
If the owner hasn't changed then we can proceed with this page,
otherwise we will skip this page and move on to a different page.
This dramatically reduces the amount of time that the pagedaemon
sleeps trying to get locks, since even 1 tick is an eternity to sleep
in this context and it was easy to trigger that case in practice,
and with this new method the pagedaemon only very rarely actually blocks
to acquire the lock that it wants since the object locks are adaptive,
and when the pagedaemon does block then the amount of time it spends
sleeping will generally be much less than 1 tick.


Revision tags: thorpej-futex-base
# 1.531 08-Sep-2020 riastradh

ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
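
The epoch usage rule above (cache the epoch when you seed, reseed when it
has changed, start the cache at 0 to force a reseed on first use) can be
sketched as a small user-space model. This is only an illustration under
stated assumptions: model_entropy_epoch(), model_consolidate() and struct
consumer are hypothetical names invented for this sketch and are not the
kernel's API or implementation.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the epoch counter (not kernel code).
 * It is never 0, and it is (unsigned)-1 until full entropy is reached.
 */
static unsigned model_epoch = (unsigned)-1;

static unsigned
model_entropy_epoch(void)
{
	return model_epoch;
}

/* Model one consolidation: advance the epoch, skipping 0 and -1. */
static void
model_consolidate(void)
{
	model_epoch++;
	if (model_epoch == 0 || model_epoch == (unsigned)-1)
		model_epoch = 1;
}

/* Hypothetical consumer that seeds itself from the entropy pool. */
struct consumer {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseeds;	/* number of (re)seeds performed */
};

static void
consumer_init(struct consumer *c)
{
	c->cached_epoch = 0;	/* epoch is never 0: reseed on first use */
	c->reseeds = 0;
}

/* Returns true if the consumer had to reseed before this use. */
static bool
consumer_use(struct consumer *c)
{
	unsigned now = model_entropy_epoch();

	assert(now != 0);
	if (c->cached_epoch == now)
		return false;
	c->cached_epoch = now;	/* cache the epoch at the time we seed */
	c->reseeds++;
	return true;
}
```

Comparing for a change, rather than testing for -1, means the model
consumer also reseeds after every later consolidation, which is the
behaviour the text above recommends.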

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.
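
The locking pattern above can be sketched as a single-threaded user-space
model. This is a simplified illustration, not the kernel code: every
model_* name is hypothetical, the mutex is modelled as a boolean, the
argument to todr_set_systime() is omitted, and the model asserts lock
ownership in both helpers even though the note below says the real
todr_save_systime() does not do so yet.

```c
#include <assert.h>
#include <stdbool.h>

static bool model_locked;	/* models the TODR mutex */
static long model_rtc;		/* models the hardware TOD clock, seconds */
static long model_systime;	/* models the system time, seconds */

static void
model_todr_lock(void)
{
	assert(!model_locked);
	model_locked = true;
}

static void
model_todr_unlock(void)
{
	assert(model_locked);
	model_locked = false;
}

static bool
model_todr_lock_owned(void)
{
	return model_locked;
}

/* Write system time to the TOD clock; caller holds the lock. */
static void
model_todr_save_systime(void)
{
	assert(model_todr_lock_owned());
	model_rtc = model_systime;
}

/* Set system time from the TOD clock; caller holds the lock. */
static void
model_todr_set_systime(void)
{
	assert(model_todr_lock_owned());
	model_systime = model_rtc;
}

/* Historically-named wrapper: take the lock around the real work. */
static void
model_resettodr(void)
{
	model_todr_lock();
	model_todr_save_systime();
	model_todr_unlock();
}

/* Suspend/resume: hold the lock across the whole sequence. */
static void
model_suspend_resume(long slept)
{
	model_todr_lock();
	model_todr_save_systime();
	model_rtc += slept;	/* the TOD clock keeps running while asleep */
	model_todr_set_systime();
	model_todr_unlock();
}
```

Holding the lock from save to set keeps any other caller from touching
the modelled clock between the two steps.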

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-1-RELEASE netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
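
A rough sketch of how a NULL-terminated list like this can be walked,
assuming the candidates are tried in order and the first success wins. The
array contents are the ones quoted above; first_usable_init and the two
stand-in callbacks are hypothetical and are not taken from init_main.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as quoted in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Hypothetical helper: return the first path for which try_exec()
 * reports success (0), or NULL if every candidate fails.
 */
static const char *
first_usable_init(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Stand-in for a real exec attempt: only /rescue/init "exists". */
static int
only_rescue_exists(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Stand-in for a system where no candidate can be executed. */
static int
nothing_exists(const char *path)
{
	(void)path;
	return -1;
}
```

With only_rescue_exists the helper falls through the three /sbin entries and
returns "/rescue/init"; with nothing_exists it returns NULL.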


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
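
The two rules above can be sketched as plain predicates. This is only an
illustration of the stated policy, not the secmodel_extensions(9) code; the
function names and the boolean "securelevel_above_0" parameter (standing in
for the result of an "is-securelevel-above" evaluation) are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helpers. Each returns true when changing the knob
 * from old_val to new_val is permitted.
 */

/* curtain: may be enabled any time, not disabled above securelevel 0. */
static bool
curtain_change_allowed(bool securelevel_above_0, bool old_val, bool new_val)
{
	if (old_val && !new_val)
		return !securelevel_above_0;
	return true;
}

/*
 * usermount / user_set_cpu_affinity: may be disabled any time,
 * not enabled above securelevel 0.
 */
static bool
usermount_change_allowed(bool securelevel_above_0, bool old_val, bool new_val)
{
	if (!old_val && new_val)
		return !securelevel_above_0;
	return true;
}
```

In both cases the change that tightens the system is always allowed, and only
the loosening direction is gated on securelevel.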

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objections. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
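
The behaviour described above (one global lock, a lockless fast path, and
no retry after a failed init routine) can be sketched in userland C11 as
follows. This is only an illustration under stated assumptions: struct
once_ctl, run_once(), once_global_lock and failing_init() are hypothetical
names, the global lock is a simple spin flag, and none of this is the
kernel's actual implementation.

```c
#include <stdatomic.h>

/* Hypothetical once-control; not the kernel's own structure. */
struct once_ctl {
	atomic_int state;	/* 0 = not yet run, 1 = done */
	int error;		/* result recorded from the init routine */
};

/* One global lock shared by every once-control (a spin flag here). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once(struct once_ctl *o, int (*fn)(void))
{
	/* Lockless fast path: already run, return the recorded result. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == 1)
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		;	/* spin until the lock is free */

	if (atomic_load_explicit(&o->state, memory_order_relaxed) == 0) {
		o->error = fn();
		/* Mark done even on failure: there is no retry. */
		atomic_store_explicit(&o->state, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->error;
}

/* Example init routine that always fails, counting its invocations. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;	/* arbitrary nonzero error value */
}
```

Calling run_once() twice with failing_init() runs the routine a single
time; the second call returns the recorded error from the fast path.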


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
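
As a rough illustration of the cap described above, the following userland
sketch computes the largest vnode count that stays within the limit. The
helper name vnode_limit() and the macro VNODE_COST_BYTES are hypothetical
and are not taken from the kernel source; only the 2kB per-vnode estimate
and the 75% thresholds come from the commit message.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode plus associated structures. */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest vnode count whose worst-case footprint
 * stays within 75% of both KVA and physical memory.
 */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t smaller = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Divide by 4 before multiplying by 3 so large inputs cannot overflow. */
	return (smaller / 4) * 3 / VNODE_COST_BYTES;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the smaller figure governs:
75% of 1 GiB divided by 2kB gives a limit of 393216 vnodes.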


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in the few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structure initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
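
The reference-counting rule in the list above (a lookup takes a reference,
release drops it, and detaching an in-use discipline fails with EBUSY) can
be sketched in userland C as follows. This is a simplified illustration,
not the kernel code: struct ldisc and the ldisc_*() functions are
hypothetical stand-ins for the linesw structure and the ttyldisc_*()
routines, and all locking is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified line-discipline record. */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* references held by successful lookups */
	struct ldisc *next;	/* linked list replaces the old array */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails if the name is already registered. */
static int
ldisc_attach(struct ldisc *d)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Find a discipline by name; a successful lookup holds a reference. */
static struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
static void
ldisc_release(struct ldisc *d)
{
	if (d->refcnt > 0)
		d->refcnt--;
}

/* Unregister a discipline; refuse while references are outstanding. */
static int
ldisc_detach(struct ldisc *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			return 0;
		}
	return ENOENT;
}
```

In this sketch, a detach attempted between a lookup and the matching
release returns EBUSY; once the reference is released, the detach succeeds.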


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks
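
The lookup path described above can be modelled roughly as follows. This is an illustrative sketch only, not net80211 code: the `Node` and `NodeTable` classes, the `map_size` parameter (standing in for a table sized from cs_max_keyid), the `hash_lookups` counter and the use of `None` for IEEE80211_KEYIX_NONE are all assumptions made for the example.

```python
KEYIX_NONE = None  # stand-in for IEEE80211_KEYIX_NONE (assumption)


class Node:
    def __init__(self, mac, rx_keyix=KEYIX_NONE):
        self.mac = mac
        self.rx_keyix = rx_keyix  # rx key index of the unicast key, if any


class NodeTable:
    def __init__(self, map_size):
        self.by_mac = {}                    # models the hash lookup path
        self.keyix_map = [None] * map_size  # key index -> node mapping table
        self.hash_lookups = 0               # instrumentation for this example

    def _in_bounds(self, keyix):
        # Bounds check every table access against the map size.
        return keyix is not KEYIX_NONE and 0 <= keyix < len(self.keyix_map)

    def add(self, node):
        self.by_mac[node.mac] = node

    def find_rxnode_withkey(self, mac, keyix=KEYIX_NONE):
        # Fast path: the h/w key index already maps to a node.
        if self._in_bounds(keyix) and self.keyix_map[keyix] is not None:
            return self.keyix_map[keyix]
        # Slow path: hash lookup, then update the map if the node
        # found has an rx key index set.
        self.hash_lookups += 1
        node = self.by_mac.get(mac)
        if node is not None and self._in_bounds(node.rx_keyix):
            self.keyix_map[node.rx_keyix] = node
        return node

    def delete_ucastkey(self, node):
        # Drop the key and any key map reference to the node.
        if self._in_bounds(node.rx_keyix) and self.keyix_map[node.rx_keyix] is node:
            self.keyix_map[node.rx_keyix] = None
        node.rx_keyix = KEYIX_NONE
```

In this model a driver that leaves the rx key index at KEYIX_NONE never populates the map, so every receive falls back to the hash lookup, which matches the "naive driver" default described in the commit.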

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)
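
The idea of collecting full cipher blocks from an arbitrarily fragmented chain can be sketched as below. This is a simplified model, not the kernel code: fragments are plain `bytes` objects rather than mbufs, the data is copied rather than processed in place, and the name `iter_blocks` is hypothetical.

```python
def iter_blocks(fragments, block_size=16):
    """Yield block_size-byte blocks from a sequence of byte fragments.

    A block may span any number of fragments; a trailing partial
    block, if present, is yielded last.
    """
    buf = bytearray()
    for frag in fragments:
        buf.extend(frag)
        # Emit every complete block accumulated so far.
        while len(buf) >= block_size:
            yield bytes(buf[:block_size])
            del buf[:block_size]
    if buf:
        yield bytes(buf)
```

For example, fragments of 5, 3, 8 and 4 bytes produce one full 16-byte block spanning the first three fragments, followed by a 4-byte remainder.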

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
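
The substitution described above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel implementation: the values in `MAGIC_VALUES` are made up, the `enabled` flag merely stands in for the MNT_MAGICLINKS mount flag, and the kernel's exact rules for where a magic string may appear in a path are not modelled.

```python
import re

# Hypothetical values; a real system would supply its own.
MAGIC_VALUES = {
    "machine": "i386",
    "machine_arch": "i386",
    "hostname": "example-host",
    "domainname": "example.org",
    "kernel_ident": "GENERIC",
    "osrelease": "3.0",
    "ostype": "NetBSD",
}

# Longest names first, so "@machine_arch" is not read as "@machine" + "_arch".
_MAGIC_RE = re.compile(
    "@(" + "|".join(sorted(map(re.escape, MAGIC_VALUES), key=len, reverse=True)) + ")"
)


def expand_magic(target, values=MAGIC_VALUES, enabled=True):
    """Expand magic strings in a symlink target; leave unknown ones alone."""
    if not enabled:  # mount without the magic links flag
        return target
    return _MAGIC_RE.sub(lambda m: values.get(m.group(1), m.group(0)), target)
```

With `machine_arch` set to `sparc`, the `/bin` link from the example usage would resolve to `/arch/sparc/bin`; with expansion disabled the target is returned unchanged.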


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
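
The "iterate until nobody did any work" rule for finalizers can be sketched as follows. This is a minimal model, assuming each finalizer is a callable that returns true when it did some work; the function name `run_finalizers` and the returned pass count are inventions for the example, not the kernel interface.

```python
def run_finalizers(finalizers):
    """Invoke all finalizers repeatedly until a full pass does no work.

    Returns the number of passes made, including the final idle pass.
    """
    passes = 0
    while True:
        passes += 1
        did_work = False
        # Every finalizer runs on every pass, even if an earlier one
        # already reported work.
        for finalize in finalizers:
            if finalize():
                did_work = True
        if not did_work:
            return passes
```

Running every finalizer on each pass matters because work done by one routine (for example, configuring a new device) may give another routine something to do on the next pass.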


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
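
The sizing rule can be sketched as below. The commit message gives only the memory fraction and the NVNODE floor; converting the memory budget to a vnode count by dividing by a per-vnode size is an assumption of this example, as are the function name and all parameter names and values.

```python
def default_vnode_count(physmem_bytes, vnode_size, nvnode_min, per_mille=5):
    """Size the vnode cache from physical memory.

    per_mille=5 corresponds to the 0.5% figure above. vnode_size is an
    assumed per-vnode cost in bytes; nvnode_min models the NVNODE floor.
    """
    budget = physmem_bytes * per_mille // 1000  # bytes set aside for vnodes
    return max(nvnode_min, budget // vnode_size)
```

With these assumptions, a machine with 2 GiB of memory and a 256-byte vnode would default to 41943 vnodes, while a 16 MiB machine would stay at a floor of 1000.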


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
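
The two properties listed above (caller-supplied handle storage, constant-time insert and remove) can be illustrated with a toy timing wheel. This is a user-space model only: the class and method names are made up and do not mirror the kernel API, and Python sets stand in for the linked lists a kernel would more likely use.

```python
class Callout:
    """Handle storage owned by the client, so arming never allocates."""

    def __init__(self):
        self.func = None
        self.arg = None
        self.expiry = None
        self.pending = False


class CallWheel:
    def __init__(self, size):
        self.buckets = [set() for _ in range(size)]
        self.now = 0  # current tick

    def reset(self, c, ticks, func, arg):
        """Arm (or re-arm) a callout to fire `ticks` ticks from now."""
        self.stop(c)
        c.func, c.arg = func, arg
        c.expiry = self.now + max(1, ticks)
        c.pending = True
        self.buckets[c.expiry % len(self.buckets)].add(c)  # O(1) insert

    def stop(self, c):
        """Cancel a pending callout; harmless if it is not pending."""
        if c.pending:
            self.buckets[c.expiry % len(self.buckets)].discard(c)  # O(1) remove
            c.pending = False

    def tick(self):
        """Advance one tick and run callouts that expire exactly now."""
        self.now += 1
        bucket = self.buckets[self.now % len(self.buckets)]
        due = [c for c in bucket if c.expiry == self.now]
        for c in due:
            bucket.discard(c)
            c.pending = False
            c.func(c.arg)
```

A callout whose expiry is further away than the wheel size shares a bucket with nearer ones, which is why `tick` compares the stored expiry rather than firing everything in the bucket.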


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
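
As a rough illustration of the "low address and a size" convention described above, the sketch below computes an initial stack pointer for either stack direction. The names `stack_dir` and `initial_sp` are hypothetical and are not taken from the NetBSD sources; real machine-dependent code would also apply alignment and frame setup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical direction tag; real ports hard-code this per architecture. */
enum stack_dir { STACK_GROWS_DOWN, STACK_GROWS_UP };

/*
 * Given a stack area described by its lowest address and its size,
 * return the address where the stack pointer would start.
 */
uintptr_t
initial_sp(uintptr_t base, size_t size, enum stack_dir dir)
{
    if (dir == STACK_GROWS_DOWN)
        return base + size;     /* start at the top, grow toward base */
    return base;                /* start at the bottom, grow upward */
}
```

The caller never needs to know the direction: it always passes the same (base, size) pair, and only this one function differs between ports.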


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
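
The rule in this change can be sketched in a few lines of C. This is a simplified user-space model, not the kernel code: `limit_pair` stands in for `struct rlimit`, and `init_limit` is a hypothetical helper name.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rlimit. */
struct limit_pair {
    long cur;   /* soft limit (rlim_cur) */
    long max;   /* hard limit (rlim_max) */
};

/*
 * The hard limit is the system-wide maximum (e.g. maxfiles or maxproc),
 * not RLIM_INFINITY; the soft limit is the compile-time default
 * (e.g. NOFILE or MAXUPRC) clamped to that maximum.
 */
struct limit_pair
init_limit(long compile_default, long system_max)
{
    struct limit_pair lim;

    lim.max = system_max;
    lim.cur = compile_default < system_max ? compile_default : system_max;
    return lim;
}
```

With this shape, a program calling getrlimit() always sees a finite hard limit, and the soft limit can never exceed it.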


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.
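
A minimal user-space sketch of the deferred-creation idea follows, assuming a simple FIFO of (callback, argument) pairs that is drained once the early processes exist. The names `defer_create`, `run_deferred`, and `sample_count` are hypothetical and do not reflect the actual kernel interface.

```c
#include <assert.h>
#include <stdlib.h>

/* One queued request: a callback and the argument to hand to it. */
struct deferred {
    void (*fn)(void *);
    void *arg;
    struct deferred *next;
};

struct deferred *deferred_head;
struct deferred **deferred_tailp = &deferred_head;

/* Queue a callback to be run later; returns 0 on success, -1 on failure. */
int
defer_create(void (*fn)(void *), void *arg)
{
    struct deferred *d = malloc(sizeof(*d));

    if (d == NULL)
        return -1;
    d->fn = fn;
    d->arg = arg;
    d->next = NULL;
    *deferred_tailp = d;        /* append to keep FIFO order */
    deferred_tailp = &d->next;
    return 0;
}

/* Run every queued callback in the order it was queued, then empty the queue. */
void
run_deferred(void)
{
    struct deferred *d;

    while ((d = deferred_head) != NULL) {
        deferred_head = d->next;
        d->fn(d->arg);
        free(d);
    }
    deferred_tailp = &deferred_head;
}

/* Sample callback for illustration: counts how often it has been called. */
void
sample_count(void *arg)
{
    ++*(int *)arg;
}
```

In this model nothing runs at queue time; the callbacks fire only when `run_deferred()` is called, which corresponds to the point after init, the pagedaemon, and the reaper have been created.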


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
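
The main trap in this conversion is the argument order: the b* routines take the source first, the mem* routines take the destination first. The sketch below shows the two copy cases with hypothetical helper names (`copy_record`, `shift_left`) that keep the old source-first calling shape.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Was: bcopy(src, dst, len). The source and destination swap places. */
void
copy_record(const void *src, void *dst, size_t len)
{
    memcpy(dst, src, len);
}

/*
 * Was: ovbcopy(buf + by, buf, len - by). The regions overlap, so the
 * replacement must be memmove(), not memcpy().
 */
void
shift_left(char *buf, size_t len, size_t by)
{
    memmove(buf, buf + by, len - by);
}
```

Note also that memcmp() returns an ordering (negative, zero, positive), whereas bcmp() only promised zero or non-zero, so existing `== 0` and `!= 0` tests carry over unchanged.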


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.531 08-Sep-2020 riastradh

ipi: Split up initialization into two parts.

First part runs early so ipi_register can be used in module
initialization, e.g. via pktqueue_create; second part runs after CPUs
have been detected.


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
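
The epoch-counter usage described above ("cache the epoch when you seed, reseed if it has changed") can be modelled in user space as follows. This is only a sketch: `model_epoch` and `model_entropy_epoch()` are stand-ins for the kernel's entropy_epoch(), and `seeded_consumer` is a hypothetical consumer type.

```c
#include <assert.h>

/* Stand-in for the kernel epoch; -1 means full entropy was never reached. */
unsigned model_epoch = (unsigned)-1;

unsigned
model_entropy_epoch(void)
{
    return model_epoch;
}

/* Hypothetical consumer that keeps some seeded state. */
struct seeded_consumer {
    unsigned cached_epoch;  /* epoch observed at the last (re)seed */
    unsigned reseeds;       /* how many times we have (re)seeded */
};

void
consumer_init(struct seeded_consumer *c)
{
    /* The epoch is never zero, so 0 forces a reseed on first use. */
    c->cached_epoch = 0;
    c->reseeds = 0;
}

void
consumer_use(struct seeded_consumer *c)
{
    unsigned now = model_entropy_epoch();

    /* Compare for change rather than testing for -1, as the log advises. */
    if (now != c->cached_epoch) {
        c->cached_epoch = now;
        c->reseeds++;       /* a real consumer would draw fresh seed here */
    }
    /* ... produce output from the current state ... */
}
```

Because the comparison is for any change, a consumer seeded before full entropy (epoch -1) picks up a fresh seed automatically the first time consolidation advances the epoch.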

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
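
The suspend/resume pattern listed above can be traced with a small user-space model. All `model_` functions are hypothetical stubs that only record the order of calls; they are not the kernel's todr routines, and the lock here is a plain flag rather than a mutex. In line with the note above, only the set-time stub asserts that the lock is held.

```c
#include <assert.h>
#include <string.h>
#include <time.h>

char todr_trace[16];    /* one letter per step, in call order */
int todr_held;          /* models ownership of the TODR lock */

void
model_todr_lock(void)
{
    assert(!todr_held);
    todr_held = 1;
    strcat(todr_trace, "L");
}

void
model_todr_unlock(void)
{
    assert(todr_held);
    todr_held = 0;
    strcat(todr_trace, "U");
}

void
model_todr_save_systime(void)
{
    strcat(todr_trace, "S");    /* write system time to the TOD clock */
}

void
model_todr_set_systime(time_t base)
{
    (void)base;
    assert(todr_held);          /* caller is expected to hold the lock */
    strcat(todr_trace, "R");    /* read the TOD clock back into system time */
}

/* The whole sequence runs under one lock acquisition. */
void
model_suspend_resume(time_t base)
{
    model_todr_lock();
    model_todr_save_systime();
    strcat(todr_trace, "z");    /* machine-dependent sleep and wakeup */
    model_todr_set_systime(base);
    model_todr_unlock();
}
```

Holding the lock across the sleep is the point of the pattern: no other path can touch the TOD clock between the save and the restore.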


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compatible

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs, the percpu list of
psrefs that tracks references can be corrupted. A reference to a tracking object
is recorded in the list via an intermediate object (struct psref) that is
normally allocated on a thread's stack. Thus, the intermediate object can be
overwritten on a leak, resulting in corruption of the list.

The tracker makes a shadow entry for each intermediate object and stores some
hints in it (currently the caller address of psref_acquire). We can detect a
leak by checking the entries at certain points where all references should have
been released, such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
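
The arithmetic described above can be sketched as follows. This is an illustration only: `struct ts` is a hypothetical stand-in for a timespec, and the function names are invented for the example. Boot time is the wall-clock time at initialization minus the interval since the clocks were started, and a message's time of day is boot time plus the uptime stamp logged with it.

```c
#include <assert.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical stand-in for a timespec: seconds plus nanoseconds. */
struct ts {
	long long sec;
	long nsec;
};

static struct ts
ts_sub(struct ts a, struct ts b)
{
	struct ts r = { a.sec - b.sec, a.nsec - b.nsec };

	if (r.nsec < 0) {		/* borrow one second */
		r.sec--;
		r.nsec += NSEC_PER_SEC;
	}
	return r;
}

static struct ts
ts_add(struct ts a, struct ts b)
{
	struct ts r = { a.sec + b.sec, a.nsec + b.nsec };

	if (r.nsec >= NSEC_PER_SEC) {	/* carry one second */
		r.sec++;
		r.nsec -= NSEC_PER_SEC;
	}
	return r;
}

/* Boot time: wall time at init minus uptime since the clocks started. */
static struct ts
compute_boottime(struct ts now, struct ts uptime)
{
	assert(uptime.nsec >= 0 && uptime.nsec < NSEC_PER_SEC);
	return ts_sub(now, uptime);
}

/* Time of day of a message: boot time plus the uptime stamp it carries. */
static struct ts
message_time(struct ts boottime, struct ts stamp)
{
	return ts_add(boottime, stamp);
}
```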


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void)
as a result of the new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.
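
The restrictions on history messages can be pictured with a small ring buffer. This is a hedged sketch, not the kernhist implementation: the `hist` types and functions below are hypothetical. Each entry holds a pointer to a literal format string and four numeric arguments, and both logging and counting tolerate a history with no allocated entries, which is the case the crash fix above is about.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_NARGS 4	/* each message carries at most four arguments */

struct hist_ent {
	const char *fmt;		/* literal format string, not copied */
	uintptr_t v[HIST_NARGS];	/* numeric arguments only */
};

struct hist {
	struct hist_ent *e;	/* entry storage, may be NULL */
	size_t n;		/* number of entries in e */
	size_t next;		/* slot the next message goes into */
};

static void
hist_log(struct hist *h, const char *fmt,
    uintptr_t a, uintptr_t b, uintptr_t c, uintptr_t d)
{
	assert(fmt != NULL);
	if (h->e == NULL || h->n == 0)
		return;		/* nothing allocated: drop the message */
	struct hist_ent *ent = &h->e[h->next];
	ent->fmt = fmt;
	ent->v[0] = a;
	ent->v[1] = b;
	ent->v[2] = c;
	ent->v[3] = d;
	h->next = (h->next + 1) % h->n;	/* wrap: oldest entry is overwritten */
}

/* Count the used entries without touching storage that was never allocated. */
static size_t
hist_count(const struct hist *h)
{
	size_t used = 0;

	if (h->e == NULL)
		return 0;
	for (size_t i = 0; i < h->n; i++)
		if (h->e[i].fmt != NULL)
			used++;
	return used;
}
```

Storing only the format pointer and integer arguments keeps logging cheap, which is also why string arguments do not survive to a later dump.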


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
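
Since the use of the list is undocumented, the following is only a guess at the usual pattern: walk the NULL-terminated array in order and stop at the first path that can be executed. The `first_usable_init()` helper and the `try_exec` callback are hypothetical and exist only so the sketch can run in userland; only the list contents come from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* List contents as given in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical callback: returns 0 if `path` could be executed. */
typedef int (*try_exec_fn)(const char *path);

/* Return the first path that try_exec accepts, or NULL if none does. */
static const char *
first_usable_init(try_exec_fn try_exec)
{
	assert(try_exec != NULL);
	for (size_t i = 0; initpaths[i] != NULL; i++)
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	return NULL;
}

/* Mock: only the rescue binary exists. */
static int
only_rescue_exists(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Mock: no init binary exists at all. */
static int
nothing_exists(const char *path)
{
	(void)path;
	return -1;
}
```

Putting "/rescue/init" last makes it a fallback that is reached only when every earlier candidate fails.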


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
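
The register / evaluate idea can be sketched in userland as a small lookup table. Everything below is hypothetical and deliberately not named after the real secmodel(9) functions: `model_register()`, `model_eval()`, the fixed-size table, the ENOENT return for an absent model, and the toy evaluator that treats uid 0 as root are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical evaluator callback: answers the question `what` about `arg`. */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct model {
	const char *id;		/* e.g. "org.netbsd.secmodel.suser" */
	eval_fn eval;
};

#define MAX_MODELS 8
static struct model models[MAX_MODELS];
static size_t nmodels;

static int
model_register(const char *id, eval_fn eval)
{
	assert(id != NULL && eval != NULL);
	if (nmodels == MAX_MODELS)
		return ENOMEM;
	models[nmodels].id = id;
	models[nmodels].eval = eval;
	nmodels++;
	return 0;
}

/* Forward the request to the named model, or fail if it is absent. */
static int
model_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < nmodels; i++)
		if (strcmp(models[i].id, id) == 0)
			return models[i].eval(what, arg, ret);
	return ENOENT;		/* assumed error code for a missing model */
}

/* Toy "suser" evaluator: treats uid 0 as root. */
static int
suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return ENOENT;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

Because the caller goes through the lookup instead of reading another model's variables directly, an absent model turns into an error return, not an unsafe access.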


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
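
To give a feel for what such a statistical test looks like, here is a sketch of a monobit test over a 20000-bit sample. It is not the code in libkern/rngtest.c. The acceptance bounds (strictly between 9725 and 10275 one bits) are quoted from memory of FIPS 140-2 and should be treated as an assumption to check against the standard; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MONOBIT_BYTES 2500	/* 20000 bits per sample */

/* Count the one bits in a single byte. */
static unsigned
popcount8(uint8_t b)
{
	unsigned n = 0;

	while (b != 0) {
		n += b & 1u;
		b >>= 1;
	}
	return n;
}

/* Return 1 if the sample passes the monobit test, 0 otherwise. */
static int
monobit_ok(const uint8_t *sample)
{
	unsigned ones = 0;

	assert(sample != NULL);
	for (size_t i = 0; i < MONOBIT_BYTES; i++)
		ones += popcount8(sample[i]);
	/* Assumed FIPS 140-2 bounds; verify against the standard. */
	return ones > 9725 && ones < 10275;
}
```

A stuck source that returns all zeros or all ones fails immediately, which is the kind of hardware fault the attach-time checks are meant to catch.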

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
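
The prompt handling described above can be pictured with a small sketch. This is not the code from the commit: the enum, the function name classify_init_input(), and the rule that a usable path starts with '/' are all assumptions made for illustration. Only the three special strings come from the log message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of what the user typed at the init prompt. */
enum init_choice {
	INIT_PATH,	/* looks like a path to try as init(8) */
	INIT_HALT,	/* special string "halt" */
	INIT_REBOOT,	/* special string "reboot" */
	INIT_DDB,	/* special string "ddb" */
	INIT_INVALID	/* neither a command nor a path; caller re-prompts */
};

/*
 * Classify one line of input.  The special strings are checked first so
 * that they are never mistaken for a relative path.
 */
static enum init_choice
classify_init_input(const char *s)
{
	assert(s != NULL);

	if (strcmp(s, "halt") == 0)
		return INIT_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_DDB;
	/* Assumption: only absolute paths are accepted as an init path. */
	if (s[0] == '/')
		return INIT_PATH;
	return INIT_INVALID;
}
```

A caller would loop on the prompt until the result is something other than INIT_INVALID, then either exec the path or take the requested action.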


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
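
A rough user-space sketch of the pattern described here (lockless fast path, one global lock for the slow path, no retry after an error) might look as follows. It uses C11 atomics and a pthread mutex as stand-ins for the kernel primitives. The names struct once_ctl, run_once(), once_global_lock, and the demo initializer are hypothetical, and the real implementation may differ in detail.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once-control block; zero-initialized means "not yet run". */
struct once_ctl {
	atomic_int done;	/* 0 = never run, 1 = ran (success or error) */
	int error;		/* result of the single run */
};

/* One global lock shared by all controls, instead of a lock per control. */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once_ctl *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock taken once the routine has run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* The result is recorded even on failure: no retry later. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Demo initializer (hypothetical): counts its calls and always fails. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return EIO;
}
```

Calling run_once() twice with failing_init returns EIO both times while the initializer itself runs only once. Note that this simplified version holds the global lock across the call to fn, so an initializer that itself calls run_once() would deadlock here.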


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
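
The limit described above reduces to simple arithmetic, sketched below. The 2kB cost and the 75% figure come from the log message; the helper names, the use of byte counts as inputs, and the -1 rejection value are assumptions for illustration and not the actual sysctl handler.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode plus associated structures (from the log). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest vnode count whose worst-case footprint
 * stays within 75% of KVA and within 75% of physical memory.
 */
static uint64_t
max_desired_vnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit;

	assert(kva_bytes > 0 && phys_bytes > 0);

	/* The tighter of the two resources decides the cap. */
	limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so that large sizes cannot overflow. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical validator: 0 if the requested value is acceptable, else -1. */
static int
check_desired_vnodes(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= max_desired_vnodes(kva_bytes, phys_bytes) ? 0 : -1;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the KVA is the tighter bound, giving a cap of 393216 vnodes (768 MiB / 2 kB).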


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot
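
The semantics above can be illustrated with a small user-space sketch. It
assumes that "tracking delta of set-time operations" means shifting the
recorded boot time by the same amount the wall clock is stepped, so that
wall clock minus boot time stays continuous. All names (sketch_clock,
sketch_settime, and so on) are hypothetical and are not the kernel's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a wall clock plus a recorded boot time (seconds). */
struct sketch_clock {
	int64_t now;		/* current wall-clock time */
	int64_t boottime;	/* wall-clock time at which the system booted */
};

/* Boot: the only place where boottime is set outright. */
void
sketch_boot(struct sketch_clock *c, int64_t wall)
{
	assert(c != NULL);
	c->now = wall;
	c->boottime = wall;
}

/* Time passing normally: the wall clock advances, boottime is untouched. */
void
sketch_tick(struct sketch_clock *c, int64_t secs)
{
	c->now += secs;
}

/* Stepping the clock: apply the same delta to boottime (assumed behaviour). */
void
sketch_settime(struct sketch_clock *c, int64_t new_wall)
{
	int64_t delta = new_wall - c->now;

	c->now = new_wall;
	c->boottime += delta;
}

/* Time since boot, unaffected by clock steps. */
int64_t
sketch_uptime(const struct sketch_clock *c)
{
	return c->now - c->boottime;
}
```

In this model, stepping the clock forward by an hour moves boottime forward
by an hour as well, so the reported uptime does not jump.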


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
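
The reference-counting rule described above (a lookup holds a reference, and
detaching an in-use discipline fails with EBUSY) can be sketched in user
space as follows. This is an illustration only: the sketch_ names, the
structure layout, and the signatures are assumptions and do not reproduce the
kernel's ttyldisc_*() interfaces, and no locking is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line discipline record kept on a singly linked list. */
struct sketch_ldisc {
	const char *name;
	unsigned refcnt;		/* references held by lookups */
	struct sketch_ldisc *next;
};

static struct sketch_ldisc *sketch_ldisc_list;

/* Register a discipline; duplicate names are rejected. */
int
sketch_ldisc_attach(struct sketch_ldisc *d)
{
	struct sketch_ldisc *p;

	for (p = sketch_ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = sketch_ldisc_list;
	sketch_ldisc_list = d;
	return 0;
}

/* Look a discipline up by name; a successful lookup takes a reference. */
struct sketch_ldisc *
sketch_ldisc_lookup(const char *name)
{
	struct sketch_ldisc *p;

	for (p = sketch_ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from sketch_ldisc_lookup(). */
void
sketch_ldisc_release(struct sketch_ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister a discipline; refuse while references are outstanding. */
int
sketch_ldisc_detach(struct sketch_ldisc *d)
{
	struct sketch_ldisc **pp;

	if (d->refcnt != 0)
		return EBUSY;
	for (pp = &sketch_ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```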


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
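
A once control, as used here and in the neighbouring commits, lets a
subsystem initialize itself on first use instead of from main(). The
following is a minimal single-threaded sketch of the idea, not the kernel's
implementation: the sketch_once type and sketch_run_once() are hypothetical
names, and a real version would need locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once-control record for this sketch. */
struct sketch_once {
	int done;	/* nonzero once the initializer has run */
	int error;	/* value the initializer returned */
};

/*
 * Run init() the first time through and remember its result; later calls
 * return the remembered result without calling init() again.
 */
int
sketch_run_once(struct sketch_once *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->error = init();
		o->done = 1;
	}
	return o->error;
}

/* Example subsystem: counts how often its initializer actually runs. */
int sketch_init_calls;

int
sketch_subsys_init(void)
{
	sketch_init_calls++;
	return 0;
}

static struct sketch_once sketch_subsys_once;

/* Entry point that lazily initializes the subsystem on first open. */
int
sketch_subsys_open(void)
{
	return sketch_run_once(&sketch_subsys_once, sketch_subsys_init);
}
```

With this pattern no call from main() is needed; whichever entry point is
reached first triggers the initialization.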


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
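
The expansion step can be sketched in user space as below. This is not the
kernel code: the table values ("i386" and so on) are example data, the
sketch_ names are hypothetical, and the rule that a magic name must be
followed by '/' or the end of the string is an assumption made here so that
@machine and @machine_arch can be told apart.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping from magic name (without '@') to its value. */
struct sketch_magic {
	const char *name;
	const char *value;
};

/* Example values only; a running system would supply its own. */
static const struct sketch_magic sketch_table[] = {
	{ "machine_arch", "i386" },
	{ "machine", "i386" },
	{ "ostype", "NetBSD" },
};

/*
 * Copy src to dst, replacing each recognized "@name" with its value.
 * Unrecognized "@..." text is copied unchanged.
 * Returns 0 on success, -1 if dst (capacity dstlen) is too small.
 */
int
sketch_expand(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0, i, n;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (i = 0; i < sizeof(sketch_table) /
			    sizeof(sketch_table[0]); i++) {
				n = strlen(sketch_table[i].name);
				if (strncmp(src + 1, sketch_table[i].name,
				    n) == 0 && (src[1 + n] == '/' ||
				    src[1 + n] == '\0')) {
					rep = sketch_table[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			n = strlen(rep);
			if (out + n >= dstlen)	/* keep room for the NUL */
				return -1;
			memcpy(dst + out, rep, n);
			out += n;
			src += skip;
		} else {
			if (out + 1 >= dstlen)	/* keep room for the NUL */
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With the example table, the link target /arch/@machine_arch/bin from the
usage above would resolve to /arch/i386/bin.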


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
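
The marker-skipping loop can be sketched with a plain singly linked list.
The structure, the is_marker field, and the SKETCH_ macro name are
hypothetical stand-ins chosen for this illustration; the real list uses the
kernel's queue macros and process flags.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry: either a real process or a placeholder marker. */
struct sketch_proc {
	int pid;
	int is_marker;		/* nonzero for marker entries */
	struct sketch_proc *next;
};

/*
 * Iterate over the list, skipping markers. The trailing "else" makes the
 * statement that follows the macro the loop body for non-marker entries.
 */
#define SKETCH_PROCLIST_FOREACH(var, head)				\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Count the real (non-marker) entries on a list. */
int
sketch_count_procs(struct sketch_proc *head)
{
	struct sketch_proc *p;
	int n = 0;

	SKETCH_PROCLIST_FOREACH(p, head) {
		assert(!p->is_marker);
		n++;
	}
	return n;
}
```

A marker lets an iterator remember its position in the list while the list
lock is dropped to call the per-process function; ordinary walkers must
therefore ignore such entries, which is what the macro does.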


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
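
As a rough worked example of the sizing rule, the sketch below turns "1% of
physical memory" into a vnode count. The commit message does not say how the
kernel performs this conversion, so the formula, the function name, and the
200-byte vnode size used in the comment are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in 1% of physical memory, given the size of
 * one vnode in bytes. Integer division rounds down at each step.
 */
uint64_t
sketch_default_vnodes(uint64_t physmem_bytes, uint64_t vnode_size)
{
	assert(vnode_size != 0);
	return (physmem_bytes / 100) / vnode_size;
}

/*
 * Example: 1 GiB of memory and an assumed 200-byte vnode give
 * (1073741824 / 100) / 200 = 10737418 / 200 = 53687 vnodes.
 */
```

A larger struct vnode, as on the 64-bit ports mentioned above, yields fewer
vnodes for the same percentage, which is why a higher default may suit them.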


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
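
A minimal sketch of the "invoke until nobody reports work" loop described
in 1.208. The names finalizer_fn, struct finalizer, run_finalizers and
countdown, and the nonzero-means-work return convention, are hypothetical
and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer type: returns nonzero if it did some work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports having done work.  Returns the number
 * of passes made (always at least one).
 */
static int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0, did_work;
	size_t i;

	assert(list != NULL || n == 0);
	do {
		did_work = 0;
		for (i = 0; i < n; i++)
			if ((*list[i].fn)(list[i].arg))
				did_work = 1;
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has work to do while the counter is positive. */
static int
countdown(void *arg)
{
	int *remaining = arg;

	if (*remaining <= 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

The final pass is always a "quiet" one, so a finalizer whose work is
enabled by another finalizer gets a chance to run before the loop ends.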


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.
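
For illustration, a small sketch of list traversal with the <sys/queue.h>
macros rather than by touching the link members directly. struct item,
item_list and sum_items are hypothetical names; the fallback definition of
LIST_FOREACH is only there for systems whose header lacks it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Fallback for a <sys/queue.h> that does not provide LIST_FOREACH. */
#ifndef LIST_FOREACH
#define LIST_FOREACH(var, head, field)					\
	for ((var) = (head)->lh_first; (var) != NULL;			\
	    (var) = (var)->field.le_next)
#endif

struct item {
	int value;
	LIST_ENTRY(item) link;	/* list linkage, managed by the macros */
};

LIST_HEAD(item_list, item);

/* Sum the values on the list without referring to lh_first/le_next. */
static int
sum_items(struct item_list *head)
{
	struct item *it;
	int sum = 0;

	assert(head != NULL);
	LIST_FOREACH(it, head, link)
		sum += it->value;
	return sum;
}
```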


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
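
The mapping above can be sketched as a lookup function. This is only an
illustration: the OLD_KERN_* enumerators and their numeric values are
hypothetical stand-ins for the retired codes, and old_kern_to_errno is not
a function from the tree. Codes without a single replacement (KERN_FAILURE
and the unused ones) return -1 here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are illustrative. */
enum {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old-style code to the E* value listed in the table.
 * Returns -1 for codes that have no one-to-one replacement.
 */
static int
old_kern_to_errno(int code)
{

	assert(code >= 0);
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		/* KERN_FAILURE ("various") and the unused codes. */
		return -1;
	}
}
```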


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
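
A sketch of the kind of overflow 1.182 guards against, assuming the value
is computed as a page count times a page size. uint64_t is used as a
stand-in for rlim_t, and avail_bytes and fits_in_int32 are hypothetical
helpers, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute "available memory" in bytes.  Widening one operand before
 * the multiplication keeps the product from wrapping in 32 bits.
 */
static uint64_t
avail_bytes(uint32_t free_pages, uint32_t page_size)
{

	assert(page_size != 0);
	return (uint64_t)free_pages * page_size;
}

/* True if the value could have been stored in a signed 32-bit int. */
static int
fits_in_int32(uint64_t v)
{

	return v <= (uint64_t)INT32_MAX;
}
```

With 786432 free pages of 4096 bytes (3 GiB) the result no longer fits in
a signed 32-bit integer, which is where a narrower type would overflow or
be sign extended.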


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
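
A rough sketch of the sizing rule in 1.175: spend 0.5% of memory on
vnodes, but never go below the configured minimum. autosize_vnodes and
its parameters are hypothetical, and the per-vnode byte cost passed in is
an assumption of the example, not a figure from the commit.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the number of vnodes that fit in 0.5% (1/200) of memory,
 * clamped from below by min_vnodes (the NVNODE floor in the commit).
 */
static uint64_t
autosize_vnodes(uint64_t mem_bytes, uint64_t per_vnode_bytes,
    uint64_t min_vnodes)
{
	uint64_t n;

	assert(per_vnode_bytes > 0);
	n = (mem_bytes / 200) / per_vnode_bytes;
	return n < min_vnodes ? min_vnodes : n;
}
```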


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes the
long-standing bug that the accounting file can't be put on an NFS file
system (so, e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
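
The rule in 1.140 can be sketched as follows: the hard limit is the
system-wide maximum and the soft limit is the compiled-in default clamped
to that maximum. init_rlimit and rlim_min are hypothetical helpers written
for this example; only struct rlimit and rlim_t come from
<sys/resource.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

static rlim_t
rlim_min(rlim_t a, rlim_t b)
{

	return a < b ? a : b;
}

/*
 * Initialize one resource limit:
 *   rlim_max = system-wide maximum (e.g. maxfiles or maxproc)
 *   rlim_cur = min(compiled-in default, system-wide maximum)
 */
static void
init_rlimit(struct rlimit *rl, rlim_t compiled_default, rlim_t system_max)
{

	assert(rl != NULL);
	rl->rlim_cur = rlim_min(compiled_default, system_max);
	rl->rlim_max = system_max;
}
```

Neither field ends up as RLIM_INFINITY, so a program that sizes a table
from getrlimit(2) gets a usable number.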


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
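
The conversions listed in 1.129 can be written out as wrappers to make the
argument order explicit. The compat_ names are hypothetical and exist only
for this sketch; the point to notice is that bcopy() and ovbcopy() take
the source first, while memcpy() and memmove() take the destination first.

```c
#include <assert.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len); regions must not overlap. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{

	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len); overlap is allowed. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{

	assert(src != NULL && dst != NULL);
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len); only equality (zero) is compared. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{

	return memcmp(a, b, len);
}

/* bzero(p, len) -> memset(p, 0, len). */
static void
compat_bzero(void *p, size_t len)
{

	memset(p, 0, len);
}
```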


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non-contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.
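
A sketch of alignment arithmetic with an ALIGNBYTES-style mask, relevant
to both 1.71 and 1.70. SK_ALIGNBYTES is a hypothetical stand-in (the real
ALIGNBYTES is machine-dependent), and align_up and align_down are names
made up for this example. Rounding up is the wrong operation when carving
space downward from the top of a region, which is one reading of the 1.70
message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the machine-dependent ALIGNBYTES mask. */
#define SK_ALIGNBYTES	(sizeof(long) - 1)

/* Round an address up to the next aligned boundary. */
static uintptr_t
align_up(uintptr_t p)
{

	/* The mask trick only works for power-of-two alignments. */
	assert(((SK_ALIGNBYTES + 1) & SK_ALIGNBYTES) == 0);
	return (p + SK_ALIGNBYTES) & ~(uintptr_t)SK_ALIGNBYTES;
}

/* Round an address down; use this when moving toward lower addresses. */
static uintptr_t
align_down(uintptr_t p)
{

	assert(((SK_ALIGNBYTES + 1) & SK_ALIGNBYTES) == 0);
	return p & ~(uintptr_t)SK_ALIGNBYTES;
}
```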


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.530 07-Sep-2020 thorpej

Add the ability to set an alternate cnmagic in the kernel config
file, e.g.:

options CNMAGIC="\"+++++\""


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
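
The trade-off described above can be sketched in userland C. This is only
an illustration under stated assumptions: the sim_* names, the per-CPU
counter array, and the caching policy are all hypothetical and are not
taken from the UVM sources; only the idea of a boolean "cached value will
do" argument comes from the log entry.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_NCPU 4			/* hypothetical CPU count */

static int sim_percpu_free[SIM_NCPU];	/* pretend per-CPU free-page counts */
static int sim_cached_total;		/* last total that was summed */
static unsigned sim_expensive_sums;	/* how often the costly path ran */

/*
 * Hypothetical stand-in for a uvm_availmem(bool)-style interface:
 * cached == true returns the most recently computed total without
 * touching the per-CPU counters; cached == false recomputes it.
 */
static int
sim_availmem(bool cached)
{
	if (!cached) {
		int total = 0;

		/* The costly part: visit every CPU's counter. */
		for (int i = 0; i < SIM_NCPU; i++)
			total += sim_percpu_free[i];
		sim_cached_total = total;
		sim_expensive_sums++;
	}
	assert(sim_cached_total >= 0);
	return sim_cached_total;
}
```

A frequent caller that tolerates slightly stale numbers would pass true; a
caller that needs the exact current total would pass false and pay for the
full summation.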


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
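
The epoch usage rule above (cache the epoch when seeding, reseed when it
has changed, initialize the cache to 0 to force a reseed on first use)
might look roughly like the following userland sketch. sim_entropy_epoch()
is a stand-in for the kernel's entropy_epoch(), and struct consumer and
its functions are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated global epoch; (unsigned)-1 models "never reached full entropy". */
static unsigned sim_epoch = (unsigned)-1;

/* Stand-in for the kernel's entropy_epoch(); not the real function. */
static unsigned
sim_entropy_epoch(void)
{
	return sim_epoch;
}

/* Hypothetical consumer: remembers the epoch it was last seeded at. */
struct consumer {
	unsigned seeded_epoch;	/* 0 means "not seeded yet" */
	unsigned reseed_count;	/* for demonstration only */
};

static void
consumer_init(struct consumer *c)
{
	c->seeded_epoch = 0;	/* epoch is never 0, so first use reseeds */
	c->reseed_count = 0;
}

/* Returns true if the consumer reseeded before being used. */
static bool
consumer_use(struct consumer *c)
{
	unsigned now = sim_entropy_epoch();

	assert(now != 0);	/* the log states the epoch is never zero */
	if (now == c->seeded_epoch)
		return false;	/* nothing consolidated since last seed */

	/* A real consumer would draw fresh seed material here. */
	c->seeded_epoch = now;
	c->reseed_count++;
	return true;
}
```

Comparing for change, rather than testing against -1, means the consumer
also reseeds after later consolidations, which is the behavior the log
entry recommends.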

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
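
The locking pattern described above can be sketched in userland C. Every
sim_* name below is a stand-in: the real todr_*() functions are not used,
their signatures are not reproduced (the log shows todr_set_systime()
taking arguments that are omitted here), and the lock is modeled as a
plain flag rather than a kernel mutex.

```c
#include <assert.h>
#include <stdbool.h>

static bool sim_todr_locked;	/* models the TODR mutex as a flag */
static long sim_rtc_seconds;	/* pretend battery-backed clock */
static long sim_system_seconds;	/* pretend system time */

static void
sim_todr_lock(void)
{
	assert(!sim_todr_locked);
	sim_todr_locked = true;
}

static void
sim_todr_unlock(void)
{
	assert(sim_todr_locked);
	sim_todr_locked = false;
}

/* Save system time to the clock; this sketch requires the lock. */
static void
sim_todr_save_systime(void)
{
	assert(sim_todr_locked);
	sim_rtc_seconds = sim_system_seconds;
}

/* Load system time from the clock; caller must hold the lock. */
static void
sim_todr_set_systime(void)
{
	assert(sim_todr_locked);
	sim_system_seconds = sim_rtc_seconds;
}

/* The suspend/resume pattern: one lock hold spans the whole sequence. */
static void
sim_suspend_resume(long asleep_for)
{
	sim_todr_lock();
	sim_todr_save_systime();
	sim_rtc_seconds += asleep_for;	/* clock keeps running while asleep */
	sim_todr_set_systime();
	sim_todr_unlock();
}

/* Historically named wrapper: takes and releases the lock itself. */
static void
sim_resettodr(void)
{
	sim_todr_lock();
	sim_todr_save_systime();
	sim_todr_unlock();
}
```

Unlike the NOTE above, this sketch asserts the lock in the save path as
well; that reflects the intended contract, not the state of the code at
this revision.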


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
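
The log does not show the affected expression, so the following is only a
generic illustration of how a lost widening cast produces this class of
bug. The function names and the 32-bit page count are assumptions made for
the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Widen one operand before multiplying so the product is computed in
 * 64 bits.
 */
static uint64_t
avail_bytes(uint32_t pages, uint32_t page_size)
{
	assert(page_size != 0);
	return (uint64_t)pages * page_size;
}

/*
 * Without the cast the multiplication is done in 32 bits and wraps
 * modulo 2^32 before the result is ever widened.
 */
static uint32_t
avail_bytes_narrow(uint32_t pages, uint32_t page_size)
{
	return (uint32_t)(pages * page_size);
}
```

With 2^21 pages of 4096 bytes (8 GiB), the widened version yields 2^33
while the narrow version wraps to 0.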


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
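
The size mismatch described above can be illustrated in userland C. The
type widths are assumptions chosen to model an n32-style ABI (32-bit
pointers, 64-bit register-sized syscall argument slots); none of the sim_*
names exist in the kernel, and memcpy() stands in for copyout().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t sim_ptr32_t;	/* models a 32-bit user pointer */
typedef uint64_t sim_reg_t;	/* models a register-sized syscall slot */

#define SIM_ARGC 3		/* the log says argv[] holds 3 pointers */

/* Size the buggy code would have used: an array of syscall arg slots. */
#define SIM_WRONG_LEN (sizeof(sim_reg_t[SIM_ARGC]))

/*
 * Copy argv[] into a destination buffer using the size of argv itself,
 * not the size of an unrelated array. Returns the bytes copied.
 */
static size_t
sim_copy_argv(void *dst, size_t dstlen, const sim_ptr32_t *argv)
{
	size_t len = SIM_ARGC * sizeof(argv[0]);

	assert(len <= dstlen);	/* the wrong length would overrun dst */
	memcpy(dst, argv, len);
	return len;
}
```

Under these assumed widths the correct length is 12 bytes while the
mistaken one is 24, so using it would copy past the end of the argv area.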


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as are lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
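
Since the use of the list is undocumented, the following is only a guess
at the general shape: a NULL-terminated array walked in order until one
candidate works. The list contents are copied from the entry above;
first_usable_init() and the predicate functions are hypothetical and do
not correspond to kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as quoted in the log entry. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical predicate: nonzero if the path could be started. */
typedef int (*try_exec_fn)(const char *path);

/* Return the first path the predicate accepts, or NULL if none. */
static const char *
first_usable_init(try_exec_fn try_exec)
{
	assert(try_exec != NULL);
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]))
			return initpaths[i];
	}
	return NULL;
}

/* Example predicates: only the rescue copy works, or nothing works. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}

static int
nothing_works(const char *path)
{
	(void)path;
	return 0;
}
```

Appending "/rescue/init" at the end means it is considered only after the
three earlier candidates have been rejected.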


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate the vnode cache size based on the resource it gets allocated from.
This stops kern.maxvnodes from being set so high that it exhausts available space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
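
The two rules above can be sketched as a pair of predicates. This is only
an illustration of the stated policy; the function names are hypothetical
and the real checks go through secmodel_eval(9) rather than taking the
securelevel as a plain integer.

```c
#include <stdbool.h>

/*
 * curtain: may be enabled at any time; disabling it is only allowed
 * while securelevel is not above 0.
 */
bool
curtain_change_allowed(int securelevel, bool enable)
{
	return enable || securelevel <= 0;
}

/*
 * usermount / user_set_cpu_affinity: may be disabled at any time;
 * enabling is only allowed while securelevel is not above 0.
 */
bool
usermount_change_allowed(int securelevel, bool enable)
{
	return !enable || securelevel <= 0;
}
```

In both cases the direction that tightens security is always permitted,
and only the loosening direction depends on securelevel.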

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree list is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree(), which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
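
A rough sketch of how a reply at the init path prompt could be classified,
given the special strings named above. The enum, the function name, and the
rule that a usable path must start with '/' are assumptions made for this
illustration, not details taken from the commit.

```c
#include <string.h>

/* Hypothetical classification of a reply typed at the init prompt. */
enum init_answer {
	ANS_PATH,	/* looks like a path to an init program */
	ANS_HALT,
	ANS_REBOOT,
	ANS_DDB,
	ANS_INVALID	/* neither a command nor a plausible path */
};

enum init_answer
classify_init_answer(const char *s)
{
	if (strcmp(s, "halt") == 0)
		return ANS_HALT;
	if (strcmp(s, "reboot") == 0)
		return ANS_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return ANS_DDB;
	/* Assumption: only absolute paths are accepted as init paths. */
	if (s[0] == '/')
		return ANS_PATH;
	return ANS_INVALID;
}
```

An ANS_INVALID result is the case where a caller would repeat the prompt
instead of panicking.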


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
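
The behaviour described above (lockless fast path, a global lock for the
slow path, and no retry after an error) can be sketched in userland C as
follows. This is an illustration using pthreads and C11 atomics, not the
kernel implementation; once_sketch_t, run_once_sketch(), failing_init() and
init_calls are hypothetical names.

```c
#include <pthread.h>
#include <stdatomic.h>

enum { ONCE_VIRGIN = 0, ONCE_DONE = 1 };

typedef struct {
	atomic_int state;	/* ONCE_VIRGIN until the routine has run */
	int error;		/* result of the single run, kept forever */
} once_sketch_t;

/* One global lock shared by all once objects. */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

int
run_once_sketch(once_sketch_t *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has run. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		/* Run exactly once; an error is recorded, not retried. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Hypothetical init routine that always fails, counting its calls. */
int init_calls;

int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

Calling run_once_sketch() twice with failing_init() runs the routine a
single time and hands the same error back to both callers.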


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
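
As a worked example of the arithmetic, here is a small sketch of the cap
under the stated assumptions (2kB per vnode, at most 75% of KVA or physical
memory, whichever is smaller). The helper name vnode_limit() and the macro
VNODE_COST are hypothetical; the kernel's actual calculation may differ.

```c
#include <stdint.h>

#define VNODE_COST	2048u	/* worst-case bytes per vnode, per the log */

/*
 * Hypothetical helper: the largest vnode count whose worst-case
 * footprint stays within 75% of both KVA and physical memory.
 */
uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the smaller resource, divided by the per-vnode cost. */
	return (smaller / 4 * 3) / VNODE_COST;
}
```

For 1 GiB of KVA and 4 GiB of physical memory, 75% of 1 GiB is
805306368 bytes, which allows at most 393216 vnodes at 2kB each.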


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
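
The reference-counting rule described above (a lookup holds a reference,
release drops it, and detaching an in-use discipline fails with EBUSY) can be
sketched as follows. This is a simplified userland model: the ldisc_* names
and the struct layout are hypothetical stand-ins for the ttyldisc_*
interfaces, and the locking a kernel implementation needs is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified line discipline record. */
struct ldisc {
	const char *name;
	unsigned int refcnt;	/* references held by lookups */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails if the name is already taken. */
static int
ldisc_attach(struct ldisc *ld)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
static struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
static void
ldisc_release(struct ldisc *ld)
{
	if (ld->refcnt > 0)
		ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
static int
ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

In this model a provider attaches its own record at initialization time,
which mirrors the point above that disciplines register themselves rather
than being declared in one central table.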


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
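
The idea of a once control can be illustrated with a small userland sketch:
the subsystem's first entry point runs the initializer, and later calls skip
it. The names once_ctl, run_once(), subsys_init() and subsys_open() are
hypothetical, this is not the kernel's RUN_ONCE(9) implementation, and the
mutual exclusion a real kernel needs is deliberately left out. Caching the
first result instead of retrying is a simplification of this sketch.

```c
/* Hypothetical once control: whether init ran, and what it returned. */
struct once_ctl {
	int done;
	int error;
};

/*
 * Run init() the first time only; every call returns the result of
 * that first run. Not thread-safe: a real implementation must
 * serialize callers.
 */
static int
run_once(struct once_ctl *o, int (*init)(void))
{
	if (!o->done) {
		o->done = 1;
		o->error = init();
	}
	return o->error;
}

static struct once_ctl subsys_once;
static int subsys_init_calls;	/* counts how often init really ran */

static int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

/* Stand-in for a "first open" path: no explicit init call at boot. */
static int
subsys_open(void)
{
	return run_once(&subsys_once, subsys_init);
}
```

With this pattern the boot path no longer needs to know about the subsystem,
which is why the explicit init call can be dropped from main().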


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)
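
The receive-side lookup described in the node table changes can be modelled
roughly as below: try the key index map first, fall back to the slower
lookup, and remember the result in the map. All names, the one-byte "mac"
field, the table size and the linear search standing in for the hash lookup
are assumptions for illustration; this is not the net80211 code.

```c
#include <stddef.h>
#include <stdint.h>

#define KEYIX_NONE ((uint16_t)-1)	/* "no h/w key index" marker */
#define MAX_KEYIX 8			/* hypothetical driver-supplied max */

struct node {
	uint8_t mac;		/* stand-in for a full MAC address */
	uint16_t rx_keyix;	/* rx key index of the unicast key */
};

struct node_table {
	struct node *keyixmap[MAX_KEYIX];	/* key index -> node */
	struct node *nodes;
	size_t nnodes;
};

/* Slow path; a linear search stands in for the hash lookup. */
static struct node *
slow_lookup(struct node_table *nt, uint8_t mac)
{
	for (size_t i = 0; i < nt->nnodes; i++)
		if (nt->nodes[i].mac == mac)
			return &nt->nodes[i];
	return NULL;
}

/* Find the node for a received frame, using the key index if given. */
static struct node *
find_rxnode_withkey(struct node_table *nt, uint8_t mac, uint16_t keyix)
{
	struct node *ni;

	/* Fast path: bounds-checked map lookup (also rejects KEYIX_NONE). */
	if (keyix < MAX_KEYIX && nt->keyixmap[keyix] != NULL)
		return nt->keyixmap[keyix];

	/* Slow path, then update the map if the node has an rx key index. */
	ni = slow_lookup(nt, mac);
	if (ni != NULL && ni->rx_keyix < MAX_KEYIX)
		nt->keyixmap[ni->rx_keyix] = ni;
	return ni;
}
```

A driver that leaves the rx key index at KEYIX_NONE never populates the map
in this model, so it always takes the slow path.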

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week
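
The "bounds check and drop" policy for received beacon intervals can be
sketched as follows. The limits BINTVAL_MIN and BINTVAL_MAX and the function
names are hypothetical; the log does not state the real bounds, so the values
here are placeholders only.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds in time units (TU); placeholders, not real limits. */
#define BINTVAL_MIN 1u
#define BINTVAL_MAX 1000u

static unsigned int rx_discarded;	/* frames dropped for a bogus interval */

/* True if a received beacon interval is within the accepted range. */
static bool
bintval_valid(uint16_t bintval)
{
	return bintval >= BINTVAL_MIN && bintval <= BINTVAL_MAX;
}

/*
 * Accept or drop a beacon/probe response based on its interval.
 * The frame is dropped rather than clamped, so a station never runs
 * with an interval that differs from what the AP advertised.
 */
static bool
accept_beacon(uint16_t bintval)
{
	if (!bintval_valid(bintval)) {
		rx_discarded++;
		return false;
	}
	return true;
}
```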

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
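
The expansion step can be pictured with the following user-space sketch.
It is not the kernel code: struct magic, magic_expand() and the table
contents are hypothetical, and the sketch simply substitutes any listed
"@name" token wherever it appears, which may be looser than the matching
rules the kernel applies.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name (without '@') and its value. */
struct magic {
	const char *name;
	const char *value;
};

/*
 * Expand "@name" tokens from src into dst (dstlen bytes, always
 * NUL-terminated on success).  Longer names must precede their
 * prefixes in the table so "@machine_arch" is not taken as "@machine".
 * Returns 0 on success, -1 if the result does not fit.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic *tab, size_t ntab)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 1, i, n;

		if (*src == '@') {
			for (i = 0; i < ntab; i++) {
				n = strlen(tab[i].name);
				if (strncmp(src + 1, tab[i].name, n) == 0) {
					rep = tab[i].value;
					skip = n + 1;
					break;
				}
			}
		}
		if (rep != NULL) {
			n = strlen(rep);
			if (out + n >= dstlen)
				return -1;
			memcpy(dst + out, rep, n);
			out += n;
		} else {
			/* Ordinary character (or unknown '@' token). */
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src;
		}
		src += skip;
	}
	dst[out] = '\0';
	return 0;
}
```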


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
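
As a rough illustration of the sizing rule, the sketch below computes a
vnode count from 1% of physical memory. The function name, its parameters
and the minimum-count floor are assumptions made for the example; the
kernel's own calculation is not reproduced here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: how many vnodes fit in about 1% of physical
 * memory, never going below a configured minimum.  vnode_size must
 * be nonzero.
 */
uint64_t
vnode_cache_size(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t min_vnodes)
{
	uint64_t n = (physmem_bytes / 100) / vnode_size;

	return n < min_vnodes ? min_vnodes : n;
}
```

With a larger struct vnode the same 1% yields fewer entries, which is why
the 64-bit ports may want a bigger share.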


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
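
The "invoke until nobody did any work" loop can be sketched as below.
The types and names (finalizer_fn, struct finalizer, run_finalizers,
countdown) are hypothetical and only illustrate the control flow; they
are not the autoconfiguration API.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one complete pass reports
 * no work at all.  Returns the number of passes made.
 */
int
run_finalizers(struct finalizer *tab, size_t n)
{
	int passes = 0, did_work;
	size_t i;

	do {
		did_work = 0;
		for (i = 0; i < n; i++) {
			if (tab[i].fn(tab[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has work to do until its counter reaches zero. */
int
countdown(void *arg)
{
	int *left = arg;

	if (*left > 0) {
		(*left)--;
		return 1;
	}
	return 0;
}
```

Iterating to a fixed point lets one finalizer's work (for example,
configuring a RAID set) expose new devices for another to handle on the
next pass.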


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
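
The mapping above, written out as a lookup. The OLD_ prefix, the enum
values and the function name are invented for this sketch so it stands
alone; KERN_FAILURE and the unused codes are left out because the table
gives them no single errno.

```c
#include <errno.h>

/* Stand-ins for the retired codes; the numeric values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old code to its errno; -1 means "no direct mapping". */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```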


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it into a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
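
The rule in the two items above could be sketched roughly as follows. The
struct and function names (rlimit_sketch, init_limit) are hypothetical, and
the parameters merely stand in for NOFILE/MAXUPRC and maxfiles/maxproc; this
is not the code from the commit:

```c
#include <stdint.h>

typedef uint64_t rlim_sketch_t;

struct rlimit_sketch {
	rlim_sketch_t rlim_cur;	/* soft limit */
	rlim_sketch_t rlim_max;	/* hard limit */
};

static rlim_sketch_t
min_rlim(rlim_sketch_t a, rlim_sketch_t b)
{
	return a < b ? a : b;
}

/*
 * compile_default stands in for NOFILE or MAXUPRC;
 * system_max stands in for maxfiles or maxproc.
 */
static struct rlimit_sketch
init_limit(rlim_sketch_t compile_default, rlim_sketch_t system_max)
{
	struct rlimit_sketch rl;

	rl.rlim_max = system_max;	/* a real bound, not "infinity" */
	rl.rlim_cur = min_rlim(compile_default, system_max);
	return rl;
}
```

With this shape the soft limit can never exceed the hard limit, even on a
system configured with a small maxfiles or maxproc.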


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
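
As an illustration of the mapping above, here are small wrappers that keep
the old source-first argument order and forward to the ISO C functions. The
wrapper names are hypothetical and exist only to make the argument swap
visible:

```c
#include <string.h>

/* Old BSD order: source first, destination second. */
static void
copy_old_style(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);		/* was bcopy(src, dst, len) */
}

/* Overlap-safe variant. */
static void
move_old_style(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);		/* was ovbcopy(src, dst, len) */
}

static void
zero_old_style(void *buf, size_t len)
{
	memset(buf, 0, len);		/* was bzero(buf, len) */
}
```

Note that memcpy() and memmove() take the destination first, so the first
two arguments swap, while memcmp() keeps the order of bcmp(). Callers that
only test the result against zero behave the same with either comparison
function.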


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for a process to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.
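
A rough user-space sketch of the trade-off described above, where a caller
that can tolerate a slightly stale total avoids the expensive fetch. All
names here (availmem_sketch, exact_total, cached_total, expensive_fetches)
are hypothetical stand-ins and do not reflect the actual uvm_availmem()
implementation:

```c
#include <stdbool.h>

static long exact_total = 1000;		/* pretend this is costly to sum */
static long cached_total = 1000;	/* last value handed out */
static unsigned expensive_fetches;	/* counts the costly path */

/*
 * Return the available-memory total. If cached_ok is true, a recent
 * cached value is returned without touching the expensive source.
 */
static long
availmem_sketch(bool cached_ok)
{
	if (cached_ok)
		return cached_total;
	expensive_fetches++;		/* the costly path */
	cached_total = exact_total;
	return cached_total;
}
```

Hot paths that are called thousands of times a second would pass true, and
only the few callers that need the very latest figure would pass false.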


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
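
The epoch usage pattern described above (cache the epoch when seeding, reseed
if it has changed before use, initialize the cache to 0 to force a reseed on
first use) might look roughly like this. It is a user-space sketch:
entropy_epoch_stub() and fake_epoch stand in for the kernel's
entropy_epoch(), and struct seeded_gen is a hypothetical consumer:

```c
#include <stdbool.h>

/* Stand-in for the kernel's epoch counter; never 0 by convention. */
static unsigned fake_epoch = 1;

static unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

struct seeded_gen {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned reseeds;	/* how many times we reseeded */
};

static void
gen_init(struct seeded_gen *g)
{
	g->cached_epoch = 0;
	g->reseeds = 0;
}

/* Call before using the generator; returns true if a reseed happened. */
static bool
gen_use(struct seeded_gen *g)
{
	unsigned now = entropy_epoch_stub();

	if (now == g->cached_epoch)
		return false;	/* seed is still current */
	/* A real generator would draw fresh seed material here. */
	g->cached_epoch = now;
	g->reseeds++;
	return true;
}
```

Comparing for change rather than testing for -1 means the consumer also
picks up later consolidations, which is the behavior the commit message
recommends.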

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.
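
The two corrections above amount to timespec arithmetic, sketched here in
userland C. The helper names (ts_sub, ts_add, boottime_from, message_tod) are
hypothetical and are not taken from the kernel or from dmesg; the sketch assumes
normalized timespecs.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* a - b, for normalized timespecs with a >= b. */
struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < NSEC_PER_SEC);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {		/* borrow one second */
		r.tv_sec--;
		r.tv_nsec += NSEC_PER_SEC;
	}
	return r;
}

/* a + b, for normalized timespecs. */
struct timespec
ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= NSEC_PER_SEC) {	/* carry one second */
		r.tv_sec++;
		r.tv_nsec -= NSEC_PER_SEC;
	}
	return r;
}

/* boottime = time of day at initialisation minus the uptime at that moment. */
struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}

/* Wall-clock second of a message: boottime plus its uptime stamp, truncated. */
time_t
message_tod(struct timespec boottime, struct timespec msg_uptime)
{
	return ts_add(boottime, msg_uptime).tv_sec;
}
```

With a time of day of 1000.2 s set 6.5 s after the clocks started, boottime
comes out as 993.7 s, and a message stamped at 2.9 s of uptime maps to second
996.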

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and its child nodes) is initialized in sysctl_net_pktq_setup, but the
call to sysctl_net_pktq_setup is unexpectedly skipped.

sysctl_net_pktq_setup is skipped if in6_present is false, which indicates that
the netinet6 component isn't loaded on rump kernels. However, the flag is
accidentally always false at that point because it is turned on in in6_dom_init,
which is called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.
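
The ordering dependency can be modelled in a few lines of userland C. This is a
sketch under assumptions: struct net_model and the *_model functions are
hypothetical stand-ins for in6_dom_init and if_sysctl_setup and only capture the
flag check described in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct net_model {
	bool in6_present;	/* set once the netinet6 domain is initialised */
	bool ip6_ifq_node;	/* stands for the net.inet6.ip6.ifq sysctl node */
};

/* Model of in6_dom_init: turns the flag on. */
void
in6_dom_init_model(struct net_model *n)
{
	assert(n != NULL);
	n->in6_present = true;
}

/* Model of if_sysctl_setup: creates the node only if the flag is already set. */
void
if_sysctl_setup_model(struct net_model *n)
{
	assert(n != NULL);
	if (n->in6_present)
		n->ip6_ifq_node = true;
}
```

Calling if_sysctl_setup_model before in6_dom_init_model leaves ip6_ifq_node
false; swapping the two calls, as the fix does, creates it.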

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
    1 - ptrace(2) syscall for native emulation
    2 - common ptrace(2) syscall code (shared with compat_netbsd32)
    3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
  kernel if "options PTRACE" is included in the config file (this is
  the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
  files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
  option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
    just add a procfs attribute to the sys/kern/sys_process.c file's
    entry in files.kern, and add PROCFS to the "#if defineds" for
    process_domem(). It's really confusing to have two different ways
    of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
  - vfs_syncer_add_to_worklist(struct mount *mp) to add
  - vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into the kernel.
Add:
    options SPLASHSCREEN
    makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
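
As a rough illustration of how such a NULL-terminated fallback list is typically
consumed, here is a userland sketch. It assumes the kernel simply tries each
path in order until one can be executed; pick_init and the only_rescue test
predicate are hypothetical and not part of init_main.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list quoted in the commit message. */
const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path that try_exec accepts (returns 0 for), or NULL.
 * try_exec is a hypothetical stand-in for the kernel's exec attempt.
 */
const char *
pick_init(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);
	for (; *paths != NULL; paths++) {
		if (try_exec(*paths) == 0)
			return *paths;
	}
	return NULL;
}

/* Demo predicate: pretend only the rescue binary exists. */
int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

With only_rescue as the predicate, pick_init(initpaths, only_rescue) falls
through the first three entries and returns "/rescue/init".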


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device, booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
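
The two rules above can be written down as a small userland model. This is a
sketch under assumptions, not the secmodel_extensions(9) code: the function
names are hypothetical, EPERM is an assumed error code, and the sketch only
rejects an actual change of state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* curtain: enabling is always allowed; disabling needs securelevel <= 0. */
int
set_curtain(int securelevel, bool *curtain, bool enable)
{
	assert(curtain != NULL);
	if (!enable && *curtain && securelevel > 0)
		return EPERM;	/* error code is an assumption */
	*curtain = enable;
	return 0;
}

/*
 * usermount / user_set_cpu_affinity: disabling is always allowed;
 * enabling needs securelevel <= 0.
 */
int
set_permissive_knob(int securelevel, bool *knob, bool enable)
{
	assert(knob != NULL);
	if (enable && !*knob && securelevel > 0)
		return EPERM;	/* error code is an assumption */
	*knob = enable;
	return 0;
}
```

In both cases a raised securelevel only permits moves toward the more
restrictive setting.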

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
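
For a flavour of what these statistical tests check, here is a sketch of the
monobit test as specified in FIPS 140-2 (20,000 bits, pass if the number of one
bits lies strictly between 9,725 and 10,275). It is written from the
specification and is not the code in libkern/rngtest.c; monobit_ok is a
hypothetical name, and the monobit test is only one of the tests in the suite.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIPS_SAMPLE_BYTES 2500	/* 20,000 bits */

/* Returns 1 if the sample passes the monobit test, 0 otherwise. */
int
monobit_ok(const uint8_t buf[FIPS_SAMPLE_BYTES])
{
	unsigned ones = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < FIPS_SAMPLE_BYTES; i++) {
		/* Count set bits by clearing the lowest one each round. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	return ones > 9725 && ones < 10275;
}
```

A stuck source that emits all zeros or all ones fails immediately, which is the
kind of hardware fault the attach-time check is meant to catch.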

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree list is processed with
softnet_lock held at one point, and processing the list calls
sodoloanfree(), which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem for good.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
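
The flag semantics in points 2 and 4 can be modelled as follows. This is a
userland sketch under assumptions: struct module_model and the *_model functions
are hypothetical, the flag value and the EPERM return are assumed, and only the
"must force" behaviour described above is captured.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODFLG_MUST_FORCE 0x01	/* name from the log; value is an assumption */

struct module_model {
	int mod_flags;
	bool initialized;
};

/* Set the flag; per the log, it is never cleared again. */
void
module_require_force_model(struct module_model *m)
{
	assert(m != NULL);
	m->mod_flags |= MODFLG_MUST_FORCE;
}

/* Flag every built-in that has not been successfully initialized. */
void
builtin_require_force_model(struct module_model *mods, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!mods[i].initialized)
			module_require_force_model(&mods[i]);
	}
}

/* A flagged module loads only when the caller passes force. */
int
module_load_model(struct module_model *m, bool force)
{
	assert(m != NULL);
	if ((m->mod_flags & MODFLG_MUST_FORCE) != 0 && !force)
		return EPERM;	/* error code is an assumption */
	m->initialized = true;
	return 0;
}
```

Here force plays the role of MODCTL_LOAD_FORCE: a built-in that was skipped or
explicitly unloaded stays flagged, so later reloads must ask for it explicitly.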

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for the last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialzed by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objection. Changes summary by apb@.
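
The prompt handling described above can be sketched roughly as follows. This is an illustration only, not the code from init_main.c: the enum, the function name classify_init_answer(), and the rule that a usable path must start with '/' are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of what the user typed at the init prompt. */
enum init_answer {
	ANSWER_PATH,	/* looks like a path to an init program */
	ANSWER_HALT,
	ANSWER_REBOOT,
	ANSWER_DDB,
	ANSWER_INVALID	/* caller is expected to prompt again */
};

static enum init_answer
classify_init_answer(const char *line)
{
	assert(line != NULL);

	/* The three special strings named in the commit message. */
	if (strcmp(line, "halt") == 0)
		return ANSWER_HALT;
	if (strcmp(line, "reboot") == 0)
		return ANSWER_REBOOT;
	if (strcmp(line, "ddb") == 0)
		return ANSWER_DDB;

	/* Assumption: only absolute paths are treated as init candidates. */
	if (line[0] == '/')
		return ANSWER_PATH;

	return ANSWER_INVALID;
}
```

A caller would loop on the prompt until the result is something other than ANSWER_INVALID.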


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
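
A userland sketch of the pattern described above (one global lock, an atomic "done" flag, a lockless fast path, and no retry after an error) might look like this. All names here are hypothetical, the spin lock built on atomic_flag merely stands in for a kernel mutex, and returning the saved error to later callers is an assumption rather than something the commit message states.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once-control block; not the kernel's actual structure. */
struct once_sketch {
	atomic_int done;	/* 0 until the routine has been run */
	int error;		/* result of the single run, kept afterwards */
};

/* One global lock shared by every once-control block. */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = fn();	/* run once, even if it fails */
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);

	return o->error;
}

/* Demo routine that always fails, to show that it is not retried. */
static struct once_sketch demo_once;
static int demo_calls;

static int
demo_failing_init(void)
{
	demo_calls++;
	return 5;
}
```

Calling run_once_sketch(&demo_once, demo_failing_init) twice runs demo_failing_init() a single time; the second call takes the fast path and gets the stored result.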


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
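
As a rough worked example of that bound, the following sketch computes the largest vnode count whose worst-case cost stays within 75% of both KVA and physical memory. The function names and the idea of passing the sizes in as plain byte counts are assumptions for illustration; the real check lives in the kernel's sysctl handling.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode assumed by the commit message: 2kB. */
#define VNODE_COST_SKETCH	2048u

/*
 * Largest vnode count that uses at most 75% of the smaller of the two
 * limits. Dividing by 4 before multiplying by 3 avoids overflow.
 */
static uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t phys_bytes, uint64_t cost)
{
	uint64_t limit;

	assert(cost > 0);
	limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;
	return (limit / 4 * 3) / cost;
}

/* Nonzero if a requested desiredvnodes value would be accepted. */
static int
desiredvnodes_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <=
	    max_vnodes_sketch(kva_bytes, phys_bytes, VNODE_COST_SKETCH);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, KVA is the tighter limit: 75% of 1 GiB is 805306368 bytes, which at 2048 bytes per vnode allows 393216 vnodes.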


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
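
The reference-counting rule described above (a lookup holds a reference, a release drops it, and detaching an in-use discipline fails with EBUSY) can be sketched in user-space C. This is only an illustration: struct ldisc and the ldisc_*() functions are hypothetical stand-ins, not the kernel's ttyldisc_*() code, and the sketch does no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified line-discipline record (not the kernel's linesw). */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* references handed out by ldisc_lookup() */
	struct ldisc *next;	/* singly linked registry list */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails with EEXIST if the name is already taken. */
int
ldisc_attach(struct ldisc *ld)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Look up by name; on success the caller holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
int
ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A caller would attach a discipline once, pair every ldisc_lookup() with an ldisc_release(), and expect ldisc_detach() to succeed only after the last release.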


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
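
As a rough user-space analogy for the "once control" idea (run an initializer exactly once, on first use, instead of calling it unconditionally from main()), the sketch below uses POSIX pthread_once(). It is not the kernel interface; subsys_init(), subsys_open() and the counter are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical subsystem state; names are illustrative only. */
static pthread_once_t subsys_once = PTHREAD_ONCE_INIT;
static int subsys_init_count;

/* One-time initializer: runs at most once no matter how many callers race. */
static void
subsys_init(void)
{
	subsys_init_count++;
}

/*
 * Stand-in for an open routine: initialize lazily on first use.
 * Returns how many times the initializer has run (always 1 here).
 */
int
subsys_open(void)
{
	int error = pthread_once(&subsys_once, subsys_init);

	assert(error == 0);
	return subsys_init_count;
}
```

Calling subsys_open() repeatedly leaves subsys_init_count at 1, which is the property that lets the explicit init call be dropped from the boot path.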


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
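
To make the expansion concrete, here is a minimal user-space sketch of substituting magic strings in a symlink target. It is not the kernel implementation: magic_expand() and magic_tab are hypothetical, the table values are placeholders, and the rule that a magic string must be followed by '/' or the end of the string is an assumption of this sketch.

```c
#include <assert.h>
#include <string.h>

struct magic {
	const char *name;	/* magic string as it appears in the link */
	const char *value;	/* replacement text */
};

/* Placeholder values for illustration; a real system reports its own. */
static const struct magic magic_tab[] = {
	{ "@machine_arch", "i386" },
	{ "@machine", "i386" },
	{ "@ostype", "NetBSD" },
};

/*
 * Copy src to dst, replacing each recognized magic string with its value.
 * A magic string only matches when followed by '/' or the end of src
 * (assumption of this sketch). Returns 0 on success, -1 if dst is too small.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0;

	while (*src != '\0') {
		const char *rep = NULL;
		size_t adv = 1;

		if (*src == '@') {
			for (size_t i = 0;
			    i < sizeof(magic_tab) / sizeof(magic_tab[0]); i++) {
				size_t n = strlen(magic_tab[i].name);

				if (strncmp(src, magic_tab[i].name, n) == 0 &&
				    (src[n] == '/' || src[n] == '\0')) {
					rep = magic_tab[i].value;
					adv = n;
					break;
				}
			}
		}

		size_t replen = (rep != NULL) ? strlen(rep) : 1;

		/* Always leave room for the terminating NUL. */
		if (out + replen >= dstlen)
			return -1;
		if (rep != NULL)
			memcpy(dst + out, rep, replen);
		else
			dst[out] = *src;
		out += replen;
		src += adv;
	}
	if (out >= dstlen)
		return -1;
	dst[out] = '\0';
	return 0;
}
```

With the placeholder table, the link target from the example usage, /arch/@machine_arch/bin, expands to /arch/i386/bin, while a string such as @machinery is copied through unchanged.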


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
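
The marker-skipping iteration can be illustrated with a small user-space sketch. ENTRY_FOREACH, struct entry and sum_pids() are hypothetical stand-ins, not the kernel's PROCLIST_FOREACH or struct proc; the sketch shows only the skipping behaviour, not how proclist_foreach_call places its markers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry: either a real item or a placeholder marker. */
struct entry {
	int is_marker;		/* nonzero for marker entries */
	int pid;		/* payload; meaningless for markers */
	struct entry *next;
};

/* Like a plain list walk, but marker entries never reach the loop body. */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Example consumer: sums the payload of every non-marker entry. */
int
sum_pids(struct entry *head)
{
	struct entry *e;
	int sum = 0;

	ENTRY_FOREACH(e, head)
		sum += e->pid;
	return sum;
}
```

A list holding entries with pids 10 and 30 and a marker between them sums to 40, because the marker's payload is never visited.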


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
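
The iterate-until-quiescent behaviour of the finalization hook can be sketched as follows. This is a user-space illustration under stated assumptions: struct finalizer, run_finalizers() and drain_one() are hypothetical names, and the convention that a finalizer returns nonzero when it did work is assumed for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer, and repeat the whole pass until one complete
 * pass reports no work. Returns the number of passes made.
 */
int
run_finalizers(struct finalizer *tab, size_t n)
{
	int passes = 0;
	int worked;

	do {
		worked = 0;
		for (size_t i = 0; i < n; i++) {
			if (tab[i].fn(tab[i].arg))
				worked = 1;
		}
		passes++;
	} while (worked);
	return passes;
}

/* Example finalizer: consumes one unit of pending work per call. */
int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating whole passes lets a finalizer react to work that another finalizer completed in an earlier pass, which is what ordering-sensitive setups such as stacked devices need.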


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
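
For illustration, the fixed part of this mapping could be written as a
lookup function. The OLD_KERN_* enumerators and kern_code_to_errno()
are hypothetical stand-ins invented for this sketch (the real KERN_*
constants were removed by the change); KERN_FAILURE and the unused
codes are left out because they have no single errno equivalent.

```c
#include <errno.h>

/* Hypothetical stand-ins for the removed codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Return the errno value listed for the code, or -1 if none is fixed. */
static int
kern_code_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;
}
```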


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
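
A rough sketch of this sizing rule follows. It assumes the 0.5% figure
is a byte budget divided by a per-vnode size; the function name
auto_vnodes() and all parameter names are hypothetical, and the real
kernel computation may differ in detail.

```c
#include <assert.h>

/*
 * Pick a vnode count from the memory size: spend 0.5% of memory on
 * vnodes, but never go below the configured minimum.
 */
static unsigned long
auto_vnodes(unsigned long long mem_bytes, unsigned long vnode_size,
    unsigned long nvnode_min)
{
	unsigned long long budget, n;

	assert(vnode_size > 0);
	budget = mem_bytes / 200;	/* 0.5% of memory, in bytes */
	n = budget / vnode_size;	/* vnodes that fit in the budget */
	return n < nvnode_min ? nvnode_min : (unsigned long)n;
}
```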


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
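
The two properties listed above (caller-supplied storage, constant-time
insert and remove) can be illustrated with an intrusive doubly linked
list. This is a sketch only, not the kernel's callout implementation:
struct co_entry, struct co_list, co_insert() and co_remove() are
hypothetical names, and the sketch ignores expiry times altogether.

```c
#include <stddef.h>

/* Handle storage is owned by the caller; nothing is allocated here. */
struct co_entry {
	struct co_entry *next;
	struct co_entry **prevp;	/* points at whatever points at us */
	void (*fn)(void *);
	void *arg;
};

struct co_list {
	struct co_entry *head;
};

/* O(1): link the caller's entry at the head of the list. */
static void
co_insert(struct co_list *l, struct co_entry *c)
{
	c->next = l->head;
	if (l->head != NULL)
		l->head->prevp = &c->next;
	l->head = c;
	c->prevp = &l->head;
}

/* O(1): unlink the entry without walking the list. */
static void
co_remove(struct co_entry *c)
{
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}
```

Because each entry records the address of the pointer that refers to
it, removal never needs the list head or a search.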


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it into a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
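
As a sketch of what "accounts for stack direction" means: given the low
address and size, the initial stack pointer is either the top or the
bottom of the area. The helper initial_sp() and its grows_down flag are
hypothetical; real machine-dependent code would also apply alignment
and frame setup.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the starting stack pointer for a stack area described by
 * its lowest address and its size.
 */
static uintptr_t
initial_sp(void *stack_base, size_t stack_size, int grows_down)
{
	uintptr_t base = (uintptr_t)stack_base;

	/* Downward-growing stacks start at the high end of the area. */
	return grows_down ? base + stack_size : base;
}
```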


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
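
The rule described above, sketched with the standard struct rlimit. The
helper init_limit() and its parameter names are hypothetical; the
sketch only shows the clamping, not the kernel's actual initialization
code.

```c
#include <sys/resource.h>

/*
 * Build a limit whose hard value is the system-wide maximum and whose
 * soft value is the compiled-in default, clamped to that maximum.
 */
static struct rlimit
init_limit(rlim_t soft_default, rlim_t system_max)
{
	struct rlimit rl;

	rl.rlim_max = system_max;	/* e.g. maxfiles or maxproc */
	rl.rlim_cur = soft_default < system_max ? soft_default : system_max;
	return rl;
}
```

With this shape getrlimit(2) callers see a finite hard limit they can
size tables from, and the soft limit can never exceed the hard one.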


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
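
The main trap in this conversion is that bcopy() and ovbcopy() take
(source, destination) while memcpy() and memmove() take (destination,
source). A small sketch, using hypothetical wrapper names that keep the
old argument order:

```c
#include <stddef.h>
#include <string.h>

/* Copy len bytes from src to dst; the buffers must not overlap. */
static void
copy_bytes(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);		/* was: bcopy(src, dst, len) */
}

/* Copy len bytes when the buffers may overlap. */
static void
copy_overlapping(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);		/* was: ovbcopy(src, dst, len) */
}

/* Return nonzero if the two buffers differ. */
static int
bytes_differ(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;	/* was: bcmp(a, b, len) */
}

/* Zero len bytes of buf. */
static void
clear_bytes(void *buf, size_t len)
{
	memset(buf, 0, len);		/* was: bzero(buf, len) */
}
```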


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.529 27-Aug-2020 riastradh

Move address hashing from init_main.c to kern_sysctl.c.

This way rump gets it automatically. Make sure blake2s is in
librumpkern.so, not just in librumpkern_crypto.so, for this to work.


# 1.528 26-Aug-2020 christos

Instead of returning 0 when sysctl kern.expose_address=0, return a random
hashed value of the data. This allows sockstat to work without exposing
kernel addresses or being setgid kmem.


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
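
The epoch usage rule above (cache the epoch when seeding, reseed when it
has changed) can be sketched in user space as follows. This is only an
illustration under stated assumptions: entropy_epoch_stub(), fake_epoch,
struct consumer and the consumer_* helpers are hypothetical names invented
here, and the unsigned return type is assumed rather than taken from the
commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the kernel's entropy_epoch().  It only reads
 * a variable so the caching pattern can be exercised outside the kernel.
 * Per the commit message the epoch is never 0, and is -1 until the system
 * has reached full entropy.
 */
static unsigned fake_epoch = (unsigned)-1;

static unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

/* Hypothetical consumer: something seeded from the pool, plus its cache. */
struct consumer {
	unsigned cached_epoch;	/* 0 forces a reseed on first use */
	unsigned reseed_count;	/* how many times we have reseeded */
};

static void
consumer_init(struct consumer *c)
{
	assert(c != NULL);
	c->cached_epoch = 0;	/* epoch is never 0, so first use reseeds */
	c->reseed_count = 0;
}

/* Check the epoch before use; returns true if a reseed was needed. */
static bool
consumer_use(struct consumer *c)
{
	unsigned epoch = entropy_epoch_stub();

	if (epoch != c->cached_epoch) {
		/* Real code would draw fresh seed material here. */
		c->cached_epoch = epoch;
		c->reseed_count++;
		return true;
	}
	return false;
}
```

Comparing for a change, rather than testing for -1, means a later
consolidation still triggers a reseed even if the system had already
declared full entropy, which is the behaviour the commit message
recommends.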

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
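
The lock-held pattern described above can be sketched in user space as
follows. Everything here is an assumption made for illustration: the
sketch_* names, the boolean standing in for the mutex, and the two time_t
variables standing in for the hardware clock and the system time are not
the kernel's definitions. Unlike the committed code described in the NOTE,
this sketch asserts lock ownership in both the save and set routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

static bool todr_held;		/* stand-in for the TODR mutex */
static time_t hw_clock;		/* stand-in for the TOD hardware */
static time_t sys_time;		/* stand-in for the system time */

static void
sketch_todr_lock(void)
{
	assert(!todr_held);	/* not recursive */
	todr_held = true;
}

static void
sketch_todr_unlock(void)
{
	assert(todr_held);
	todr_held = false;
}

static bool
sketch_todr_lock_owned(void)
{
	return todr_held;
}

/* Write the system time back to the TOD clock; caller holds the lock. */
static void
sketch_todr_save_systime(void)
{
	assert(sketch_todr_lock_owned());
	hw_clock = sys_time;
}

/* Load the system time from the TOD clock; caller holds the lock. */
static void
sketch_todr_set_systime(void)
{
	assert(sketch_todr_lock_owned());
	sys_time = hw_clock;
}

/* Suspend/resume: the lock is held across the whole sequence. */
static void
sketch_suspend_resume(time_t asleep)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	hw_clock += asleep;	/* the hardware clock keeps running */
	sketch_todr_set_systime();
	sketch_todr_unlock();
}

/* Historically named wrapper: take the lock around the real work. */
static void
sketch_resettodr(void)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	sketch_todr_unlock();
}
```

Holding the lock from save to set keeps any other TOD clock access from
interleaving with the suspend window, which is the point of splitting the
locked primitives from the historically named wrappers.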


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
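
The boottime adjustment described above amounts to a timespec
subtraction: the time of day at initialization minus the interval since
the clocks were started. A minimal sketch, assuming hypothetical helper
names (ts_sub and sketch_boottime are invented here and are not the
kernel's functions):

```c
#include <assert.h>
#include <time.h>

/* Compute a - b, keeping tv_nsec normalised to [0, 1000000000). */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
	assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * tod_at_init:    time of day when the TOD clock was initialised
 * uptime_at_init: interval since the clocks were started, at that moment
 * Result:         time of day at which the clocks were started
 */
static struct timespec
sketch_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With a six second gap between starting the clocks and initializing the
time of day, the result is six seconds earlier than the initialization
time, which removes the offset reported for dmesg -T.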


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
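
The deferred hook establishment described above can be sketched in user
space as follows. This is a simplified, single-threaded illustration and
not the kernel's RUN_ONCE(): sketch_run_once, sketch_establish_hook,
sketch_sys_semop and sketch_module_fini are hypothetical names, and the
real facility also handles concurrency.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool once_done;		/* has the init routine run yet? */
static void *hook_cookie;	/* NULL until the hook is established */
static int establish_calls;	/* counts calls, for demonstration */
static int hook_storage;	/* gives the cookie something to point at */

/* Stand-in for establishing the exit hook. */
static int
sketch_establish_hook(void)
{
	hook_cookie = &hook_storage;
	establish_calls++;
	return 0;
}

/* Run fn the first time only.  Not thread-safe; illustration only. */
static int
sketch_run_once(bool *done, int (*fn)(void))
{
	assert(done != NULL && fn != NULL);
	if (*done)
		return 0;
	*done = true;
	return fn();
}

/* Every syscall entry point makes sure the hook exists first. */
static int
sketch_sys_semop(void)
{
	int error = sketch_run_once(&once_done, sketch_establish_hook);

	if (error != 0)
		return error;
	/* ... the actual semaphore operation would go here ... */
	return 0;
}

/*
 * Module unload: the hook may never have been established if no syscall
 * ran, so check instead of asserting that it is set.
 */
static void
sketch_module_fini(void)
{
	if (hook_cookie != NULL)
		hook_cookie = NULL;	/* stand-in for disestablishing */
}
```

The check in the unload path mirrors the removal of the KASSERT: a module
that is loaded and unloaded without any syscall in between never
establishes the hook, so there is nothing to remove.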


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN(), ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
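The two rules above can be modelled as a pair of predicates. This is only a
minimal userland sketch: the function names are hypothetical, and the real
code goes through the "is-securelevel-above" secmodel_eval(9) request rather
than reading an integer directly.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the security.models.extensions rules described
 * above; not the kernel's implementation.  "securelevel" stands in for
 * the answer the securelevel secmodel would give.
 */

/* curtain: enabling is always allowed, disabling only if securelevel <= 0. */
static bool
curtain_change_allowed(int securelevel, bool new_value)
{
	return new_value || securelevel <= 0;
}

/*
 * usermount / user_set_cpu_affinity: disabling is always allowed,
 * enabling only if securelevel <= 0.
 */
static bool
usermount_change_allowed(int securelevel, bool new_value)
{
	return !new_value || securelevel <= 0;
}
```

In both cases the change that tightens the policy is always permitted, and
the change that loosens it is refused once securelevel is above 0.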

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.
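The handling of the special strings can be sketched as a small classifier.
The enum and function names below are hypothetical and exist only for
illustration; in this sketch, anything that is not one of the three special
strings is treated as a candidate init path.

```c
#include <string.h>

/* Hypothetical result codes for the init path prompt. */
enum init_action {
	INIT_PATH,	/* treat the input as a path to an init program */
	INIT_HALT,
	INIT_REBOOT,
	INIT_DDB
};

/* Map one line of prompt input to an action (illustrative only). */
static enum init_action
classify_init_input(const char *s)
{
	if (strcmp(s, "halt") == 0)
		return INIT_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_DDB;
	return INIT_PATH;
}
```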

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
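The semantics described above (lockless fast path, a global lock on the slow
path, no retry after an error) can be illustrated with a userland analogue.
This is a sketch using C11 atomics and a pthread mutex; the names are
hypothetical and it does not reproduce the kernel's actual code or locking
primitives.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userland analogue of a "once" control block. */
struct once {
	atomic_int done;	/* 0 = not run yet, 1 = routine has finished */
	int error;		/* result of the single run */
};

/* One global lock shared by every once control block. */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* The routine runs exactly once, even if it fails. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Example routine that always fails, to show that there is no retry. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

Calling run_once() twice with failing_init returns the stored error both
times and leaves init_calls at 1, matching the "does not retry" behaviour.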


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
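As a worked example of the limit described above, the cap can be computed
from the smaller of the two resources. The helper names are hypothetical and
the sketch only restates the arithmetic from the commit message (2kB per
vnode, 75% ceiling); it is not the code used by the sysctl handler.

```c
#include <stdint.h>

/* Worst-case cost of a vnode and associated structures, per the text. */
#define VNODE_COST_BYTES 2048u

/* Hypothetical helper: largest desiredvnodes value under the 75% rule. */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	/* Exceeding either limit is refused, so the smaller one governs. */
	uint64_t limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* 75% of that resource, divided by the per-vnode cost. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical check that a requested value respects the cap. */
static int
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	return requested <= max_desiredvnodes(kva_bytes, physmem_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA is the binding
limit: 75% of 1 GiB is 805306368 bytes, which at 2048 bytes per vnode gives
a cap of 393216 vnodes.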


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
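
The weak-alias approach can be sketched roughly as follows. This assumes GCC or Clang on an ELF target and uses the compiler attribute directly rather than NetBSD's own macro; all sketch_* names are hypothetical:

```c
#include <assert.h>

/* Do-nothing fallback, used when no security model is linked in. */
static int
sketch_nop(void)
{
	return 0;
}

/*
 * Weak alias: a strong definition of sketch_secmodel_start() in
 * another object file would override this one at link time.
 */
int sketch_secmodel_start(void) __attribute__((weak, alias("sketch_nop")));

/* Hypothetical boot path: always safe to call, with or without a model. */
int
sketch_boot(void)
{
	int error = sketch_secmodel_start();

	assert(error == 0);	/* the no-op fallback never fails */
	return error;
}
```

The caller needs no conditional compilation: the linker decides whether the real routine or the no-op is used.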


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
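
The list-plus-refcount scheme described above might look roughly like the sketch below. It is a single-threaded illustration with hypothetical sketch_* names and no locking; it is not the actual ttyldisc_*() code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* One registered discipline; refcnt counts outstanding lookups. */
struct sketch_ldisc {
	const char *name;
	unsigned int refcnt;
	struct sketch_ldisc *next;
};

static struct sketch_ldisc *sketch_list;	/* head of the linked list */

/* Register a discipline; fails with EEXIST if the name is taken. */
int
sketch_ldisc_attach(struct sketch_ldisc *d)
{
	assert(d != NULL && d->name != NULL);
	for (struct sketch_ldisc *p = sketch_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = sketch_list;
	sketch_list = d;
	return 0;
}

/* Find a discipline by name and hold a reference on it. */
struct sketch_ldisc *
sketch_ldisc_lookup(const char *name)
{
	for (struct sketch_ldisc *p = sketch_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from sketch_ldisc_lookup(). */
void
sketch_ldisc_release(struct sketch_ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
int
sketch_ldisc_detach(struct sketch_ldisc *d)
{
	if (d->refcnt > 0)
		return EBUSY;
	for (struct sketch_ldisc **pp = &sketch_list; *pp != NULL;
	    pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

A provider attaches its discipline once; each user pairs a lookup with a release, and a detach only succeeds after the last release.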


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().
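
The "once control" idea used in these commits can be illustrated with a small sketch: several entry points request initialization, and the initializer runs only on the first request. This is a single-threaded illustration with hypothetical names; a real once control also has to serialize concurrent callers, which this sketch does not attempt:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal once control: remembers whether the initializer has run. */
struct sketch_once {
	int done;
};

static struct sketch_once shared_once;
static int shared_init_calls;	/* how many times the initializer ran */

/* Initializer for the shared data; must run at most once. */
static void
shared_data_init(void)
{
	shared_init_calls++;
}

/* Run init the first time only; later calls are no-ops. */
static void
sketch_run_once(struct sketch_once *o, void (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (o->done)
		return;
	o->done = 1;
	init();
}

/* Hypothetical first entry point (think: file system initialization). */
int
sketch_entry_a(void)
{
	sketch_run_once(&shared_once, shared_data_init);
	return shared_init_calls;
}

/* Hypothetical second entry point (think: a system call). */
int
sketch_entry_b(void)
{
	sketch_run_once(&shared_once, shared_data_init);
	return shared_init_calls;
}
```

Because whichever entry point runs first triggers the setup, no explicit call from main() is needed.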


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
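
The expansion described above can be sketched in userland C. This is only
an illustration, not the kernel's code: the names magic_entry and
magic_expand are hypothetical, and the rule that a magic name must be
followed by '/' or the end of the string is an assumption of this sketch
(it keeps "@machine" from matching the front of "@machine_arch").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name (without '@') and its value. */
struct magic_entry {
	const char *name;
	const char *value;
};

/*
 * Copy src to dst, replacing each "@name" that is followed by '/' or
 * end of string with the matching value from tab[0..ntab).
 * Returns 0 on success, -1 if dst is too small.
 */
static int
magic_expand(const char *src, const struct magic_entry *tab, size_t ntab,
    char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 1, i;

		if (*src == '@') {
			for (i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (strncmp(src + 1, tab[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			if (out + n >= dstlen)
				return -1;	/* no room for value + NUL */
			memcpy(dst + out, rep, n);
			out += n;
		} else {
			if (out + 1 >= dstlen)
				return -1;	/* no room for char + NUL */
			dst[out++] = *src;
		}
		src += skip;
	}
	dst[out] = '\0';
	return 0;
}
```

With a table holding "machine" and "machine_arch", the link target
/arch/@machine_arch/bin would expand to the per-architecture directory,
while an unknown string such as @machinery is copied through unchanged.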


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
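
The sizing rule can be written out as a small calculation. This is a
hedged sketch only: the function name, the per-mille parameter, and the
vnode size used in the test values are assumptions for illustration, and
the real kernel works from its own physmem and struct vnode definitions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: how many vnodes fit in `permille` thousandths
 * of physical memory, but never fewer than `floor_vnodes`.
 * permille = 5 models the older 0.5% default, 10 models 1%.
 */
static unsigned long
vnode_cache_default(unsigned long long physmem_bytes, unsigned permille,
    size_t vnode_size, unsigned long floor_vnodes)
{
	unsigned long long budget, count;

	assert(vnode_size > 0);
	budget = physmem_bytes / 1000 * permille;	/* bytes set aside */
	count = budget / vnode_size;			/* whole vnodes only */
	return count < floor_vnodes ? floor_vnodes : (unsigned long)count;
}
```

Dividing before multiplying trades a little precision for freedom from
overflow on very large memory sizes; doubling permille from 5 to 10
doubles the default except where the floor applies.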


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.
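
The "run until nobody did any work" loop can be sketched as follows. The
names finalizer_fn, run_finalizers and countdown_finalizer are
hypothetical and exist only for this illustration; the kernel's own
registration interface is not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer type: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

/*
 * Call every finalizer repeatedly until a full pass completes in which
 * none of them reports work. Returns the number of passes made.
 */
static int
run_finalizers(finalizer_fn *fns, void **args, size_t n)
{
	int passes = 0, did_work;
	size_t i;

	assert(n == 0 || (fns != NULL && args != NULL));
	do {
		did_work = 0;
		for (i = 0; i < n; i++) {
			if (fns[i](args[i]))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has *arg units of work left, does one per call. */
static int
countdown_finalizer(void *arg)
{
	int *left = arg;

	if (*left > 0) {
		(*left)--;
		return 1;
	}
	return 0;
}
```

Iterating to a fixed point matters because work done by one finalizer
(for example, configuring a device) may create new work for another.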

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
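
The table above can be read as a simple translation function. In this
sketch the OLD_KERN_* enumerators are hypothetical stand-ins whose
numeric values are illustrative only; KERN_FAILURE and the unused codes
are left out because the log gives no single errno for them.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes (values illustrative). */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old code to the errno value listed in the mapping. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	assert(!"unmapped code");	/* not covered by this sketch */
	return -1;
}
```

Note that two distinct old codes collapse onto ENOMEM, so the mapping is
not reversible.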


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
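
The rule in the two items above amounts to clamping the soft limit to the
system-wide maximum and using that maximum as the hard limit. A minimal
sketch, with hypothetical names (limit_pair, init_limit) and plain
unsigned long values standing in for the kernel's rlim_t:

```c
#include <assert.h>

/* Hypothetical pair of soft (cur) and hard (max) limit values. */
struct limit_pair {
	unsigned long cur;
	unsigned long max;
};

/*
 * compiled_default models NOFILE or MAXUPRC; system_max models
 * maxfiles or maxproc. The hard limit is the system maximum rather
 * than infinity, and the soft limit never exceeds it.
 */
static struct limit_pair
init_limit(unsigned long compiled_default, unsigned long system_max)
{
	struct limit_pair lp;

	lp.max = system_max;
	lp.cur = compiled_default < system_max ? compiled_default : system_max;
	assert(lp.cur <= lp.max);
	return lp;
}
```

A program that sizes a table from the hard limit then gets a finite,
usable number instead of an "infinite" value.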


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
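
The mapping above swaps the source and destination arguments for the copy routines. A small sketch of that argument order, using hypothetical userland shims (the legacy_* names are illustrative only and are not kernel API):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* bcopy(x, y, z) -> memcpy(y, x, z): source first becomes destination first. */
static void
legacy_bcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memcpy(dst, src, len);
}

/* ovbcopy(x, y, z) -> memmove(y, x, z): the overlap-safe variant. */
static void
legacy_ovbcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memmove(dst, src, len);
}

/* bcmp(x, y, z) -> memcmp(x, y, z): argument order is unchanged. */
static int
legacy_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;	/* zero when equal, nonzero otherwise */
}

/* bzero(x, y) -> memset(x, 0, y): the fill value becomes explicit. */
static void
legacy_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```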


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.527 11-Jun-2020 ad

uvm_availmem(): give it a boolean argument to specify whether a recent
cached value will do, or if the very latest total must be fetched. It can
be called thousands of times a second and fetching the totals impacts not
only the calling LWP but other CPUs doing unrelated activity in the VM
system.


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
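
The epoch usage rule in the list above (cache the epoch when seeding, reseed when it has changed, start the cache at 0 to force a first reseed) can be sketched as follows. This is a userland illustration under stated assumptions: entropy_epoch_stub(), fake_epoch and struct seeded_state are hypothetical stand-ins, and only the caching rule itself comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's entropy_epoch(); starts at -1. */
static unsigned fake_epoch = (unsigned)-1;

static unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

/* Hypothetical consumer state: a cached epoch plus a reseed counter. */
struct seeded_state {
	unsigned cached_epoch;	/* 0 = never seeded, since the epoch is never 0 */
	unsigned reseed_count;
};

/*
 * Reseed when the epoch differs from the cached value.  Comparing for
 * change, rather than against -1, also picks up later consolidations.
 */
static bool
reseed_if_needed(struct seeded_state *s)
{
	unsigned now = entropy_epoch_stub();

	assert(now != 0);	/* the epoch is documented as never zero */
	if (s->cached_epoch == now)
		return false;
	s->cached_epoch = now;	/* real code would draw fresh seed material here */
	s->reseed_count++;
	return true;
}
```

Initialising cached_epoch to 0 guarantees the first call reseeds, which matches the "reseed on first use" note above.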

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

This is another psref leak detector that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
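
The subtraction described above can be sketched with plain timespec arithmetic. This is an illustration only: timespec_diff() and estimate_boottime() are hypothetical helper names, and the sketch assumes normalized timespec values with the time of day not earlier than the elapsed interval.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Return a - b for normalized timespecs (0 <= tv_nsec < 1e9), a >= b. */
static struct timespec
timespec_diff(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < NSEC_PER_SEC);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {		/* borrow one second */
		r.tv_sec -= 1;
		r.tv_nsec += NSEC_PER_SEC;
	}
	return r;
}

/*
 * Boot time = time of day when it was initialised, minus the interval
 * that had already elapsed since the clocks were started.
 */
static struct timespec
estimate_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return timespec_diff(tod_at_init, uptime_at_init);
}
```

For example, a time of day of 1000.2 s set 6.5 s after the clocks started gives a boot time of 993.7 s instead of 1000.2 s.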

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
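
A rough illustration of why the two sizes differ, assuming an n32-like ABI
in which user pointers are 32 bits wide and each syscall argument slot is
register-sized (64 bits). The sk_* types are hypothetical stand-ins, not
NetBSD definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed widths for illustration only. */
typedef uint32_t sk_user_ptr_t;    /* stand-in for a 32-bit char * */
typedef uint64_t sk_syscallarg_t;  /* stand-in for a register-sized slot */

enum { SK_INIT_ARGC = 3 };         /* init's argv[] holds three pointers */

/* Bytes that belong to argv[]: three user pointers. */
size_t sk_argv_bytes(void)
{
    return SK_INIT_ARGC * sizeof(sk_user_ptr_t);
}

/* Bytes covered by sizeof on an array of syscall argument slots. */
size_t sk_args_bytes(void)
{
    sk_syscallarg_t args[SK_INIT_ARGC];
    return sizeof(args);
}
```

Under these assumptions a copyout sized from the argument array would copy
twice as many bytes as argv[] actually occupies.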


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
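
The deferred hook establishment can be pictured with a small user-space
analogue of a run-once control block. All sk_* names are hypothetical, and
the sketch is single-threaded: it does not serialize concurrent callers the
way a kernel run-once facility would need to.

```c
#include <stdbool.h>

/* Hypothetical run-once control block: remembers completion and status. */
struct sk_once {
    bool done;
    int status;
};

static int sk_hooks_established;

/* Stand-in for establishing the exit hook; returns 0 on success. */
static int sk_establish_exithook(void)
{
    sk_hooks_established++;
    return 0;
}

/* Run fn the first time only; later calls return the saved status. */
int sk_run_once(struct sk_once *o, int (*fn)(void))
{
    if (!o->done) {
        o->status = fn();
        o->done = true;
    }
    return o->status;
}

static struct sk_once sk_sem_once;

/* Each syscall entry point first makes sure the hook exists. */
int sk_sys_semget(void)
{
    return sk_run_once(&sk_sem_once, sk_establish_exithook);
}

int sk_sys_semop(void)
{
    return sk_run_once(&sk_sem_once, sk_establish_exithook);
}

int sk_hook_count(void)
{
    return sk_hooks_established;
}
```

If no syscall is ever invoked, the hook is never established, which is why
the unload path can no longer assert that it was set.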


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
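
Since the use of initpaths is undocumented, the following is only a guess at
the intent: a minimal sketch that assumes the list is tried in order until
one path can be executed. The sk_* names and the try_exec callback are
hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message above. */
static const char *const sk_initpaths[] = {
    "/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path for which try_exec reports success (0),
 * or NULL if every candidate fails.
 */
const char *sk_pick_init(int (*try_exec)(const char *path))
{
    for (size_t i = 0; sk_initpaths[i] != NULL; i++) {
        if (try_exec(sk_initpaths[i]) == 0)
            return sk_initpaths[i];
    }
    return NULL;
}

/* Example callback: pretend only the rescue binary is usable. */
int sk_only_rescue(const char *path)
{
    return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Example callback: pretend nothing is usable. */
int sk_none(const char *path)
{
    (void)path;
    return -1;
}
```

Placing "/rescue/init" last means it is only reached after the three
earlier candidates have failed.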


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
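
The two rules above reduce to a pair of predicates. This is a sketch with
hypothetical sk_* names; in the real code the securelevel would be obtained
through the "is-securelevel-above" evaluation rather than passed as a plain
integer.

```c
#include <stdbool.h>

/*
 * curtain: enabling is always allowed; disabling is refused once
 * securelevel is above 0.
 */
bool sk_curtain_change_allowed(int securelevel, bool new_value)
{
    return new_value || securelevel <= 0;
}

/*
 * usermount / user_set_cpu_affinity: disabling is always allowed;
 * enabling is refused once securelevel is above 0.
 */
bool sk_usermount_change_allowed(int securelevel, bool new_value)
{
    return !new_value || securelevel <= 0;
}
```

In both cases the direction that tightens policy stays available at any
securelevel, and only the loosening direction is blocked.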

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
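
The force-flag behaviour from items 2 and 4 can be sketched as follows. The
sk_* names and flag values are hypothetical and only mirror the description
above, not the actual module(9) definitions.

```c
#include <stdbool.h>

/* Hypothetical flag values for illustration only. */
#define SK_MODFLG_MUST_FORCE  0x01u  /* module was explicitly unloaded/disabled */
#define SK_LOAD_FORCE         0x01u  /* caller asked for a forced load */

struct sk_module {
    unsigned int mod_flags;
};

/* Set the flag; the description says it is never cleared again. */
void sk_module_require_force(struct sk_module *m)
{
    m->mod_flags |= SK_MODFLG_MUST_FORCE;
}

/* A flagged module may only be (re)loaded when the caller forces it. */
bool sk_module_load_allowed(const struct sk_module *m, unsigned int load_flags)
{
    if (m->mod_flags & SK_MODFLG_MUST_FORCE)
        return (load_flags & SK_LOAD_FORCE) != 0;
    return true;
}
```

An unflagged module loads normally, while a flagged one is refused unless
the force flag is passed.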


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
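
The behaviour described above can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel code: the names once_sketch, run_once_sketch and example_failing_init are hypothetical, and a C11 atomic_flag spin lock stands in for whatever global lock the kernel uses. It shows the lockless fast path and the fact that a failed routine is not retried.

```c
#include <stdatomic.h>

/* Hypothetical control block; zero-initialize before first use. */
struct once_sketch {
	atomic_int	o_done;		/* nonzero once the routine has run */
	int		o_error;	/* result recorded from that single run */
};

/* One global lock shared by all once blocks (a spin lock stands in here). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Lockless fast path: the routine has already run. */
	if (atomic_load_explicit(&o->o_done, memory_order_acquire))
		return o->o_error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;
	if (!atomic_load_explicit(&o->o_done, memory_order_relaxed)) {
		/* Record the result even on failure; there is no retry. */
		o->o_error = fn();
		atomic_store_explicit(&o->o_done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->o_error;
}

/* Example routine that always fails, to show the no-retry behaviour. */
static int example_calls;

static int
example_failing_init(void)
{
	example_calls++;
	return 5;
}
```

Calling run_once_sketch() twice on the same block with example_failing_init returns the recorded error both times while the routine itself runs only once.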


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
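
The arithmetic behind this limit can be sketched as follows. This is a worked illustration only, assuming "2kB" means 2048 bytes; the helper names are hypothetical and do not come from the kernel source.

```c
#include <stdint.h>

/* Worst-case cost of one vnode and its associated data, per the commit message. */
#define VNODE_WORST_CASE_BYTES	2048u

/* Hypothetical helper: number of vnodes that fit in 75% of a byte budget. */
static uint64_t
vnodes_in_75_percent(uint64_t bytes)
{
	/* Divide before multiplying so large budgets cannot overflow. */
	return (bytes / 4 * 3) / VNODE_WORST_CASE_BYTES;
}

/* Hypothetical helper: the tighter of the KVA and physical memory caps. */
static uint64_t
desiredvnodes_ceiling(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t by_kva = vnodes_in_75_percent(kva_bytes);
	uint64_t by_mem = vnodes_in_75_percent(physmem_bytes);

	return by_kva < by_mem ? by_kva : by_mem;
}

/* Returns nonzero if a requested desiredvnodes value respects both caps. */
static int
desiredvnodes_acceptable(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	return requested <= desiredvnodes_ceiling(kva_bytes, physmem_bytes);
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the KVA term is the tighter one: 75% of 1 GiB is 768 MiB, which at 2048 bytes per vnode allows at most 393216 vnodes.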


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
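
The reference-counting rules above (a lookup takes a reference, ttyldisc_release() drops it, detaching an in-use discipline gives EBUSY) can be sketched in userland C. This is only an illustration under stated assumptions: the ldisc_* names and struct layout are hypothetical stand-ins, not the kernel's ttyldisc_*() code, and the locking a kernel would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland stand-in for a line discipline entry. */
struct ldisc {
	const char   *name;
	unsigned      refcnt;	/* references held by successful lookups */
	struct ldisc *next;	/* registry is a singly linked list */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails with EEXIST if the name is already taken. */
int ldisc_attach(struct ldisc *ld)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Look up by name; a successful lookup takes a reference. */
struct ldisc *ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void ldisc_release(struct ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
int ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with one ldisc_release(); only then can ldisc_detach() succeed.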


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
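
The "once control" idea used here and in the neighbouring commits can be sketched as follows: the first caller runs the initializer, later callers skip it. This is a userland illustration using C11 atomics, assuming hypothetical names (demo_once_t, demo_run_once, lkm_demo_*); it is not the kernel's own once interface.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical once control: 0 = not started, 1 = running, 2 = done. */
typedef struct {
	atomic_int state;
} demo_once_t;

#define DEMO_ONCE_INIT { 0 }

/* Run fn exactly once, no matter how many callers race to get here. */
void demo_run_once(demo_once_t *o, void (*fn)(void))
{
	int expected = 0;

	if (atomic_compare_exchange_strong(&o->state, &expected, 1)) {
		fn();				/* we won the race: initialize */
		atomic_store(&o->state, 2);
	} else {
		while (atomic_load(&o->state) != 2)
			;			/* someone else is initializing */
	}
}

static demo_once_t lkm_demo_once = DEMO_ONCE_INIT;
static int lkm_demo_inits;			/* counts initializer runs */

static void lkm_demo_init(void)
{
	lkm_demo_inits++;
}

/* Stand-in for an open routine that lazily initializes its subsystem. */
int lkm_demo_open(void)
{
	demo_run_once(&lkm_demo_once, lkm_demo_init);
	return lkm_demo_inits;
}
```

With this shape the subsystem needs no explicit call from main(); whichever entry point is reached first triggers the initialization.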


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
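
The substitution described above can be sketched as a small userland string expander. This is a hedged illustration only: magic_expand(), the table, and its placeholder values are assumptions, and the matching rule (longest key listed first, match anywhere in the string) is a simplification that may differ from what the kernel's name lookup does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct magic {
	const char *key;
	const char *val;
};

/* Placeholder values; a real system would supply its own. */
static const struct magic magic_tab[] = {
	{ "@machine_arch", "x86_64" },	/* before "@machine": longer key wins */
	{ "@machine",      "amd64"  },
	{ "@ostype",       "NetBSD" },
};

/*
 * Copy src to dst, replacing known magic strings with their values.
 * Returns 0 on success, or -1 if dst (dstlen bytes) is too small.
 */
int magic_expand(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;
	while (*src != '\0') {
		const char *rep = NULL;
		size_t adv = 1;		/* input bytes consumed this step */

		if (*src == '@') {
			size_t n = sizeof(magic_tab) / sizeof(magic_tab[0]);
			for (size_t i = 0; i < n; i++) {
				size_t klen = strlen(magic_tab[i].key);
				if (strncmp(src, magic_tab[i].key, klen) == 0) {
					rep = magic_tab[i].val;
					adv = klen;
					break;
				}
			}
		}
		size_t n = (rep != NULL) ? strlen(rep) : 1;
		if (out + n >= dstlen)	/* keep room for the terminator */
			return -1;
		memcpy(dst + out, (rep != NULL) ? rep : src, n);
		out += n;
		src += adv;
	}
	dst[out] = '\0';
	return 0;
}
```

Unknown "@" sequences are copied through unchanged, so an ordinary path containing "@" is left alone.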


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
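
A marker-skipping iteration macro of the kind described above might look like the sketch below. The struct, the PLIST_FOREACH name, and the is_marker field are hypothetical userland stand-ins chosen for illustration; they are not the definitions from the NetBSD headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry: either a real process or a placeholder marker. */
struct pentry {
	int            is_marker;	/* nonzero for marker entries */
	int            pid;
	struct pentry *next;
};

/*
 * Walk the list like a plain FOREACH, but skip marker entries.
 * The trailing "else" binds the caller's loop body to real entries only.
 */
#define PLIST_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker) continue; else

/* Example consumer: sum the pids of all real (non-marker) entries. */
int plist_pid_sum(struct pentry *head)
{
	struct pentry *p;
	int sum = 0;

	PLIST_FOREACH(p, head)
		sum += p->pid;
	return sum;
}
```

Code that iterates with such a macro never sees the markers, so an iterator that parks a marker in the list to remember its position does not disturb other walkers.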


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
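
The "invoke all finalizers until none reports work" loop can be sketched as below. This is an illustration only: the struct, run_finalizers(), and drain_one() are hypothetical names, and the real hook's registration interface is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this call. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void        *arg;
};

/*
 * Call every finalizer, and repeat full passes until one complete pass
 * does no work at all. Returns the number of passes made.
 */
int run_finalizers(const struct finalizer *tab, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			if (tab[i].fn(tab[i].arg))
				did_work = 1;
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: configures one pending unit per call. */
int drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for example, a newly configured device) may create work for another that already ran in the same pass.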


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
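
For reference, the one-to-one rows of the mapping above can be written as a small translation function. This is a sketch under assumptions: the OLD_KERN_* enum and its numeric values are made up for illustration, and KERN_FAILURE plus the unused codes are deliberately left out because the log gives them no single errno.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative enum; the values are assumptions, not the old definitions. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to errno; -1 means "no direct mapping". */
int old_kern_to_errno(enum old_kern_code k)
{
	switch (k) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	}
	return -1;
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation cannot be reversed.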


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
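
A rough sketch of the sizing rule described above, not the kernel's actual
code: take 0.5% of memory as the budget, divide by a per-vnode cost, and never
go below the configured floor. The function name and the per_vnode_bytes
parameter are assumptions made for illustration; the log does not say how the
per-vnode cost is determined.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of vnodes to cache for a given memory size.
 * per_vnode_bytes must be nonzero; floor_count plays the role of NVNODE.
 */
uint64_t
scale_vnode_count(uint64_t mem_bytes, uint64_t per_vnode_bytes,
    uint64_t floor_count)
{
	uint64_t budget = mem_bytes / 200;		/* 0.5% of memory */
	uint64_t count = budget / per_vnode_bytes;	/* vnodes that fit */

	return count < floor_count ? floor_count : count;
}
```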


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.
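
As an illustration of "low address and size", the machine-dependent step could
look roughly like the sketch below. The function and parameter names are
hypothetical, and real ports also apply alignment and frame setup that is
omitted here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the child's starting stack pointer from a
 * (low address, size) pair. On a downward-growing stack the pointer
 * starts at the top of the area; on an upward-growing one, at the base.
 */
uintptr_t
initial_stack_pointer(uintptr_t low_addr, size_t size, bool grows_down)
{
	return grows_down ? low_addr + size : low_addr;
}
```

Keeping the caller's interface direction-neutral means machine-independent
code never needs to know which way a given port's stack grows.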

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
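
The rule in the two items above, sketched in isolation. This is not the
kernel code: struct limit_pair and init_limit() are hypothetical names, with
compiled_default standing in for NOFILE or MAXUPRC and system_max for
maxfiles or maxproc.

```c
/* Hypothetical pair mirroring rlim_cur / rlim_max. */
struct limit_pair {
	unsigned long cur;	/* soft limit */
	unsigned long max;	/* hard limit */
};

/*
 * The hard limit is the system-wide maximum rather than infinity, and the
 * soft limit is the compiled-in default clamped to that maximum.
 */
struct limit_pair
init_limit(unsigned long compiled_default, unsigned long system_max)
{
	struct limit_pair l;

	l.max = system_max;
	l.cur = compiled_default < system_max ? compiled_default : system_max;
	return l;
}
```

With this shape the soft limit can never exceed the hard limit, even when the
compiled-in default is larger than the system maximum.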


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
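
The point of the table is the argument order: the b* routines take the source
first, while the ISO C routines take the destination first. A minimal sketch
of the same mapping as wrappers (the old_* names are hypothetical and exist
only for this example):

```c
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len): the regions must not overlap when using memcpy. */
void
old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): overlap-safe copy, hence memmove. */
void
old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): same argument order as memcmp; zero means equal. */
int
old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len): memset takes the fill byte as its middle argument. */
void
old_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```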


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.526 23-May-2020 ad

Move proc_lock into the data segment. It was dynamically allocated because
at the time we had mutex_obj_alloc() but not __cacheline_aligned.


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
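
The epoch usage described above (cache the epoch when seeding, reseed when
it has changed, start the cache at 0 to force a reseed on first use) can be
sketched roughly as follows. This is only an illustration: fake_epoch,
entropy_epoch_stub() and struct seeded_thing are hypothetical stand-ins
invented here, not the kernel's entropy_epoch() or any real consumer.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the kernel's entropy_epoch().  It starts at
 * -1 ("never reached full entropy") and is advanced by hand in this sketch.
 */
static unsigned fake_epoch = (unsigned)-1;

static unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that keeps some seeded state. */
struct seeded_thing {
	unsigned cached_epoch;	/* epoch observed at the last (re)seed */
	unsigned reseed_count;	/* how many times we (re)seeded */
};

static void
seeded_thing_init(struct seeded_thing *s)
{
	/* The epoch is never 0, so 0 forces a reseed on first use. */
	s->cached_epoch = 0;
	s->reseed_count = 0;
}

/* Returns true if a reseed was needed before this use. */
static bool
seeded_thing_use(struct seeded_thing *s)
{
	unsigned now = entropy_epoch_stub();

	assert(now != 0);
	if (now != s->cached_epoch) {
		/* Check for change, not for -1, as the log recommends. */
		s->cached_epoch = now;
		s->reseed_count++;
		return true;
	}
	return false;
}
```

Comparing for change rather than for -1 means a later consolidation still
triggers a reseed even if the first seeding happened before full entropy.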

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
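
The lock / save / suspend / set / unlock pattern above can be modelled in
user space as a rough sketch. Everything with a _stub suffix is a
hypothetical stand-in written for this illustration, and the signatures are
assumptions; the real functions operate on the TODR device and a kernel mutex.

```c
#include <assert.h>
#include <stdbool.h>

static bool todr_locked;	/* models the TODR mutex */
static long todr_device_time;	/* models the battery-backed clock */
static long system_time;	/* models the kernel's idea of the time */

static void
todr_lock_stub(void)
{
	assert(!todr_locked);
	todr_locked = true;
}

static void
todr_unlock_stub(void)
{
	assert(todr_locked);
	todr_locked = false;
}

static bool
todr_lock_owned_stub(void)
{
	return todr_locked;
}

/* Write system time to the clock device; caller must hold the lock. */
static void
todr_save_systime_stub(void)
{
	assert(todr_lock_owned_stub());
	todr_device_time = system_time;
}

/* Read system time back from the clock device; caller must hold the lock. */
static void
todr_set_systime_stub(void)
{
	assert(todr_lock_owned_stub());
	system_time = todr_device_time;
}

/* The suspend/resume pattern from the log, under one lock hold. */
static void
suspend_and_resume_stub(long slept)
{
	todr_lock_stub();
	todr_save_systime_stub();
	todr_device_time += slept;	/* the device keeps ticking while asleep */
	todr_set_systime_stub();
	todr_unlock_stub();
}

/* Historically named wrapper: takes and releases the lock itself. */
static void
inittodr_stub(void)
{
	todr_lock_stub();
	todr_set_systime_stub();
	todr_unlock_stub();
}
```

Holding the lock across the whole sequence keeps another caller from
touching the clock between the save and the later set.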


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
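
The log does not show the expression that overflowed. As a hedged
illustration of this general class of bug, and assuming the size is
computed as a page count times a page size, the sketch below contrasts a
32-bit product with one widened by a cast. The function names are
hypothetical.

```c
#include <stdint.h>

/* Product computed in 32 bits: wraps modulo 2^32 before being widened. */
static uint64_t
avail_bytes_narrow(uint32_t npages, uint32_t pagesize)
{
	uint32_t bytes = npages * pagesize;

	return bytes;
}

/* Cast one operand first so the multiplication happens in 64 bits. */
static uint64_t
avail_bytes_wide(uint32_t npages, uint32_t pagesize)
{
	return (uint64_t)npages * pagesize;
}
```

With 2097152 pages of 4096 bytes (8 GiB), the narrow version wraps to 0
while the widened version yields 8589934592.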


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible
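
A compile-time check of this kind might look roughly like the sketch below.
It uses C11 _Static_assert as a stand-in for the kernel's own ctassert
macro, and udata_roundtrip() is a hypothetical helper added only to show
the conversion the check is guarding.

```c
#include <stdint.h>

/* Fails the build if the two types cannot hold each other's values. */
_Static_assert(sizeof(intptr_t) == sizeof(void *),
    "intptr_t and void * differ in size");

/* Hypothetical helper: a pointer survives a trip through intptr_t. */
static void *
udata_roundtrip(void *p)
{
	intptr_t as_int = (intptr_t)p;

	return (void *)as_int;
}
```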


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
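
Since the use of initpaths is noted as undocumented, here is a hedged
sketch of how such a NULL-terminated list is typically consumed: try each
path in order and stop at the first that can be executed. The try_exec
callback and the two *_stub helpers are hypothetical; the kernel's actual
loop may differ in detail.

```c
#include <stddef.h>
#include <string.h>

static const char *const initpaths[] = {
	"/sbin/init",
	"/sbin/oinit",
	"/sbin/init.bak",
	"/rescue/init",
	NULL,
};

/* Hypothetical exec attempt: returns 0 on success, nonzero on failure. */
typedef int (*try_exec_fn)(const char *path);

/* Return the first path that try_exec accepts, or NULL if none works. */
static const char *
first_usable_init(try_exec_fn try_exec)
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Stub: pretend only the rescue copy of init is present. */
static int
only_rescue_exists_stub(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Stub: pretend no init binary is present at all. */
static int
nothing_exists_stub(const char *path)
{
	(void)path;
	return -1;
}
```

Putting "/rescue/init" last means it is only reached after the three
/sbin candidates have failed.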


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
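
As a rough illustration of the continuous-output check mentioned above,
the sketch below rejects an output block that is identical to the one
before it, which is how a stuck source shows up. The names
(rng_cont_state, rng_cont_check) and the 32-bit block size are
assumptions made for this example; they are not taken from rnd.c or
rngtest.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for a continuous-output test; not the kernel's type. */
struct rng_cont_state {
	bool     have_prev;	/* false until the first block has been seen */
	uint32_t prev;		/* previous output block */
};

/*
 * Feed one output block.  Returns true if the block is acceptable and
 * false if it repeats the previous block.  The first block is only
 * recorded, never judged.
 */
bool
rng_cont_check(struct rng_cont_state *st, uint32_t block)
{
	bool ok;

	assert(st != NULL);
	ok = !st->have_prev || st->prev != block;
	st->prev = block;
	st->have_prev = true;
	return ok;
}
```

A caller in this sketch would detach the source the first time
rng_cont_check() returns false and keep the system running.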

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree list is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree(), which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
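
The flag semantics in items 2 and 4 can be sketched as follows. This is
a userland illustration only: the sk_ names, the struct layout and the
flag values are assumptions made for the example, and stand in for
MODFLG_MUST_FORCE and MODCTL_LOAD_FORCE rather than reproducing them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in values; the real flag values may differ. */
#define SK_MODFLG_MUST_FORCE	0x01	/* models MODFLG_MUST_FORCE */
#define SK_LOAD_FORCE		0x01	/* models MODCTL_LOAD_FORCE */

/* Hypothetical, much reduced module record. */
struct sk_module {
	const char *name;
	int	    flags;
	bool	    initialized;
};

/* Set the flag; there is deliberately no function that clears it. */
void
sk_require_force(struct sk_module *m)
{
	assert(m != NULL);
	m->flags |= SK_MODFLG_MUST_FORCE;
}

/* Flag every builtin that has not been successfully initialized. */
void
sk_builtin_require_force(struct sk_module *mods, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!mods[i].initialized)
			sk_require_force(&mods[i]);
}

/* A flagged module may only be loaded when the caller forces it. */
bool
sk_may_load(const struct sk_module *m, int load_flags)
{
	if ((m->flags & SK_MODFLG_MUST_FORCE) == 0)
		return true;
	return (load_flags & SK_LOAD_FORCE) != 0;
}
```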

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
until after mountroot(), much as config_interrupt(9) defers
configuration until interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).
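
The deferral idea can be sketched as a small queue of callbacks that is
drained once the root file system is available. This is a userland
model under stated assumptions: the sk_ names, the fixed queue size and
the callback signature are invented for the example and do not mirror
the real config_mountroot(9) interface.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAXDEFER	8	/* arbitrary capacity for the sketch */

/* One deferred piece of work. */
struct sk_deferred {
	void (*func)(void *);
	void  *arg;
};

static struct sk_deferred sk_queue[SK_MAXDEFER];
static size_t sk_nqueued;

/* Queue func(arg) for later.  Returns 0 on success, -1 if rejected. */
int
sk_defer_mountroot(void (*func)(void *), void *arg)
{
	if (func == NULL || sk_nqueued == SK_MAXDEFER)
		return -1;
	sk_queue[sk_nqueued].func = func;
	sk_queue[sk_nqueued].arg = arg;
	sk_nqueued++;
	return 0;
}

/*
 * To be called once root is mounted: run the callbacks in registration
 * order, empty the queue, and return how many callbacks ran.
 */
size_t
sk_run_mountroot_hooks(void)
{
	size_t n = sk_nqueued;

	for (size_t i = 0; i < n; i++)
		sk_queue[i].func(sk_queue[i].arg);
	sk_nqueued = 0;
	return n;
}

/* Example callback: stands in for "load firmware, finish attach". */
void
sk_count_cb(void *arg)
{
	(*(int *)arg)++;
}
```

Nothing runs at registration time; a driver that needs firmware from
the root file system would do that part of its attach in the callback.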

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early in boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.
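
A minimal sketch of how such prompt input might be classified is shown
below. The enum, the function name and the rule that a usable path
starts with '/' are assumptions made for the example; only the three
special strings come from the change description.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical outcomes of reading one line at the init prompt. */
enum sk_init_action {
	SK_INIT_PATH,		/* try to exec the given path */
	SK_INIT_HALT,
	SK_INIT_REBOOT,
	SK_INIT_DDB,
	SK_INIT_INVALID		/* neither a path nor a known command */
};

enum sk_init_action
sk_classify_init_input(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return SK_INIT_HALT;
	if (strcmp(s, "reboot") == 0)
		return SK_INIT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return SK_INIT_DDB;
	if (s[0] == '/')	/* assumption: only absolute paths count */
		return SK_INIT_PATH;
	return SK_INIT_INVALID;
}
```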

Discussed, with no objections, on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
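
The behaviour described above (lockless fast path, a global lock for
the slow path, and no retry after an error) can be modelled in userland
roughly as follows. This is a sketch using C11 atomics and a pthread
mutex, not the kernel code; all sk_ names are hypothetical. For brevity
it keeps the global lock held while the routine runs, so the routine
must not call sk_run_once() itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { SK_ONCE_NEW = 0, SK_ONCE_DONE = 1 };

/* Hypothetical once-control object; zero-initialize before use. */
struct sk_once {
	atomic_int state;	/* SK_ONCE_NEW or SK_ONCE_DONE */
	int	   error;	/* result of the one and only run */
};

/* One global lock shared by every sk_once object. */
static pthread_mutex_t sk_once_lock = PTHREAD_MUTEX_INITIALIZER;

int
sk_run_once(struct sk_once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: already done, no lock taken. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) ==
	    SK_ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&sk_once_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) !=
	    SK_ONCE_DONE) {
		/* Record the result, even a failure, and never rerun. */
		o->error = fn();
		atomic_store_explicit(&o->state, SK_ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&sk_once_lock);
	return o->error;
}

/* Example routine that always fails, to show the no-retry rule. */
int sk_init_calls;

int
sk_failing_init(void)
{
	sk_init_calls++;
	return 5;
}
```

Calling sk_run_once() twice with sk_failing_init returns the same error
both times while the routine itself runs only once.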


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
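
The limit described above works out as a simple calculation. The sketch
below assumes both sizes are given in bytes and takes the smaller of
the two 75% budgets; the function names are hypothetical and the actual
sysctl handler may compute this differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SK_VNODE_COST	2048u	/* assumed worst-case bytes per vnode */

/* Largest vnode count that fits in 75% of the smaller resource. */
uint64_t
sk_max_vnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so large sizes cannot overflow. */
	return (limit / 4 * 3) / SK_VNODE_COST;
}

/* Would a requested desiredvnodes value be accepted? */
bool
sk_vnodes_ok(uint64_t want, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return want <= sk_max_vnodes(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 512 MiB of physical memory, the budget is 75% of
512 MiB, i.e. 384 MiB, which at 2 kB per vnode allows 196608 vnodes.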


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
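
The reference-counting rule described above (a lookup holds a reference, a release drops it, and detaching an in-use entry fails with EBUSY) can be sketched in userspace C. This is a minimal illustration only: the names ldisc_attach, ldisc_lookup, ldisc_release and ldisc_detach and the struct layout are hypothetical, not the kernel's ttyldisc_* API, and the locking a real kernel needs is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a line discipline entry. */
struct ldisc {
	const char   *name;
	unsigned int  refcnt;	/* references held by lookups */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

static struct ldisc *ldisc_list;	/* no locking in this sketch */

/* Register a discipline; fails with EEXIST on a duplicate name. */
int
ldisc_attach(struct ldisc *ld)
{
	assert(ld != NULL && ld->name != NULL);
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
int
ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	return ENOENT;
}
```

In this sketch a caller that has looked up an entry must release it before a detach can succeed, which is the behaviour the commit message describes.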


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
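
The "once control" pattern used here and in the neighbouring revisions can be sketched with a userspace analogue. The example below uses POSIX pthread_once rather than the kernel's own mechanism, and the names subsys_once, subsys_init and subsys_open are hypothetical; it only shows why an explicit init call from main() becomes unnecessary.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical subsystem state; not taken from the NetBSD sources. */
static pthread_once_t subsys_once = PTHREAD_ONCE_INIT;
static int subsys_init_calls;	/* counts how often the initializer ran */

/* One-time initializer: runs at most once per process. */
static void
subsys_init(void)
{
	subsys_init_calls++;
}

/*
 * Entry point analogous to "first open": every caller goes through the
 * once control, so no explicit init call from main() is needed.
 * Returns the number of times the initializer has run.
 */
int
subsys_open(void)
{
	int error = pthread_once(&subsys_once, subsys_init);

	assert(error == 0);
	return subsys_init_calls;
}
```

However many times subsys_open() is called, the initializer runs only on the first call.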


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching(ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
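
How such an expansion might work can be sketched in userspace C. This is not the kernel implementation: the function magic_expand, the struct magic table and the sample values are hypothetical, and the rule that a name matches only when followed by '/' or the end of the string is an assumption of this sketch (it keeps "@machine" from matching the prefix of "@machine_arch").

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; values are supplied by the caller. */
struct magic {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "m68k" */
};

/*
 * Copy `in` to `out`, replacing each magic name with its value.
 * Assumption of this sketch: a name matches only when it is followed
 * by '/' or the end of the string.
 * Returns 0 on success, -1 if the result does not fit in `outlen`.
 */
int
magic_expand(const char *in, char *out, size_t outlen,
    const struct magic *tab, size_t ntab)
{
	size_t o = 0;

	assert(in != NULL && out != NULL && outlen > 0);
	while (*in != '\0') {
		const char *sub = NULL;		/* replacement, if any */
		size_t sublen = 1, adv = 1;	/* default: copy one char */

		if (*in == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (strncmp(in, tab[i].name, n) == 0 &&
				    (in[n] == '/' || in[n] == '\0')) {
					sub = tab[i].value;
					sublen = strlen(sub);
					adv = n;
					break;
				}
			}
		}
		if (o + sublen >= outlen)	/* keep room for the NUL */
			return -1;
		memcpy(out + o, sub != NULL ? sub : in, sublen);
		o += sublen;
		in += adv;
	}
	out[o] = '\0';
	return 0;
}
```

With a table mapping "@machine_arch" to "m68k", the link target /arch/@machine_arch/bin from the example above would expand to /arch/m68k/bin in this sketch.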


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
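
The idea of a list walk that skips marker entries can be sketched as follows. This is an illustration under assumptions, not the kernel macro: struct node, the is_marker flag, skip_markers and NODE_FOREACH are hypothetical names, and a plain singly linked list stands in for the proclist.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; `is_marker` flags iterator placeholder entries. */
struct node {
	struct node *next;
	int          is_marker;
	int          id;
};

/* Advance from `n` (inclusive) to the first entry that is not a marker. */
static struct node *
skip_markers(struct node *n)
{
	while (n != NULL && n->is_marker)
		n = n->next;
	return n;
}

/* Iterate like a plain list walk, but never yield a marker entry. */
#define NODE_FOREACH(var, head)					\
	for ((var) = skip_markers(head); (var) != NULL;		\
	    (var) = skip_markers((var)->next))

/* Example consumer: sum the ids of the real (non-marker) entries. */
int
sum_ids(struct node *head)
{
	struct node *n;
	int sum = 0;

	NODE_FOREACH(n, head) {
		assert(!n->is_marker);
		sum += n->id;
	}
	return sum;
}
```

Code that iterates with such a macro never sees the placeholder entries that a callback-style iterator leaves in the list to remember its position.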


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
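
The sizing rule above (a fraction of physical memory, divided among vnodes) can be sketched as follows. This is an assumption-laden illustration: `default_vnode_count`, its parameters, and the idea of dividing by a per-vnode size are hypothetical and are not taken from the kernel source; the fraction is given in thousandths so that both 0.5% (5) and 1% (10) can be expressed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of vnodes that fit in permille/1000 of
 * physical memory, but never fewer than floor_count.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned permille, uint64_t floor_count)
{
	uint64_t n;

	assert(vnode_size > 0);
	/* Divide first so large memory sizes cannot overflow. */
	n = physmem_bytes / 1000 * permille / vnode_size;
	return n > floor_count ? n : floor_count;
}
```

Because struct vnode is larger on 64-bit ports, the same fraction of memory yields fewer vnodes there, which is the concern raised in the last paragraph of the message.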


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
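
The "invoke all finalizers until none reports work" rule can be sketched as a fixed-point loop. The names below (`struct finalizer`, `run_finalizers`, `drain_one`) are hypothetical and only illustrate the iteration; the kernel's registration interface is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Run every finalizer repeatedly until a full pass completes in which
 * none of them reports work.  Returns the number of passes made.
 */
int
run_finalizers(const struct finalizer *f, size_t n)
{
	int passes = 0;
	int did_work;

	assert(f != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (f[i].fn(f[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: consumes one pending item per call. */
int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending > 0) {
		(*pending)--;
		return 1;
	}
	return 0;
}
```

Repeating whole passes matters because work done by one finalizer (for example, a newly configured RAID set) may create work for another.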


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
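
The table above can be read as a simple translation function. The sketch below is illustrative only: the `OLD_KERN_*` enumerators are hypothetical stand-ins (their numeric values are not the historical ones), and codes without a single errno equivalent in the table return -1 here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the removed KERN_* codes. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		/* KERN_FAILURE had no single mapping ("various"). */
		return -1;
	}
}
```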


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
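
What a minimum power-of-two alignment means for an address can be sketched with the usual round-up idiom. `align_up` is a hypothetical helper for illustration and is not part of the uvm_map() interface.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align, where align must be a
 * nonzero power of two.
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works for powers of two, which is presumably why the interface restricts the argument to them.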


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
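
The clamping rule in the two bullets above can be sketched as follows. `struct limit`, `min_u64` and `init_limit` are hypothetical names for illustration; the real code operates on the process's rlimit array.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical soft/hard limit pair. */
struct limit {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Soft limit: the compiled-in default, clamped to the system-wide
 * maximum.  Hard limit: the system-wide maximum itself, rather than
 * an "infinite" value.
 */
struct limit
init_limit(uint64_t compiled_default, uint64_t system_max)
{
	struct limit l;

	l.cur = min_u64(compiled_default, system_max);
	l.max = system_max;
	assert(l.cur <= l.max);
	return l;
}
```

With this rule a program that sizes a table from the hard limit gets a bounded value, which is the behaviour the message says programs expect from getrlimit.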


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
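
The key detail in the mapping above is that the source and destination arguments swap places. A small sketch using only standard C library calls makes this explicit; the wrapper names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>

/* bcopy-style argument order (src, dst, len), expressed via memcpy. */
void
copy_like_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);	/* note: dst comes first */
}

/* ovbcopy-style overlapping copy, expressed via memmove. */
void
move_like_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bzero-style clear, expressed via memset. */
void
zero_like_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```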


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.525 11-May-2020 riastradh

Move cprng_init before configure.

This makes it available to device drivers, e.g. to generate MAC
addresses at random, without initialization order hacks.

Requires a minor initialization hack for cpu_name(primary cpu) early
on, since that doesn't get set until mi_cpu_attach which may not run
until the middle of configure. But this hack is less bad than other
initialization order hacks.


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
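
The epoch-caching usage described above for entropy_epoch() might look roughly like the following. This is a minimal sketch, not the kernel's code: the entropy_epoch() here is a local stand-in that reads a simulated counter, and fake_epoch, struct my_prng, my_prng_init() and my_prng_maybe_reseed() are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simulated global epoch; (unsigned)-1 models "never reached full entropy". */
static unsigned fake_epoch = (unsigned)-1;

/* Local stand-in for entropy_epoch(); the body is an assumption. */
static unsigned
entropy_epoch(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that remembers the epoch at which it last seeded. */
struct my_prng {
	unsigned seeded_epoch;	/* 0 forces a reseed on first use */
	unsigned reseed_count;	/* how many times we have reseeded */
};

static void
my_prng_init(struct my_prng *p)
{
	assert(p != NULL);
	p->seeded_epoch = 0;	/* the epoch is never 0, so this always differs */
	p->reseed_count = 0;
}

/*
 * Compare the cached epoch against the current one and reseed on any
 * change, rather than testing for -1 specifically.
 * Returns true if a reseed happened.
 */
static bool
my_prng_maybe_reseed(struct my_prng *p)
{
	unsigned now = entropy_epoch();

	if (now == p->seeded_epoch)
		return false;
	/* ... fresh seed material would be drawn here ... */
	p->seeded_epoch = now;
	p->reseed_count++;
	return true;
}
```

Checking for a change instead of for -1 means a later consolidation still triggers a reseed, which is the behaviour the log message recommends.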

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
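
The suspend/resume pattern listed above can be sketched in user-space C as follows. The function names todr_lock(), todr_unlock(), todr_save_systime() and todr_set_systime() come from the log message; everything else is an assumption made for illustration: the bodies, the todr_lock_owned() name, the long argument to todr_set_systime(), machdep_suspend(), and the trace buffer. The sketch also asserts lock ownership in todr_save_systime(), which the note above says the real code does not yet do.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

static bool todr_locked;	/* models the TODR mutex (single-threaded) */
static char todr_trace[64];	/* records call order, for illustration only */

/* Append a step marker if there is still room in the trace buffer. */
static void
trace_step(const char *s)
{
	if (strlen(todr_trace) + strlen(s) < sizeof(todr_trace))
		strcat(todr_trace, s);
}

static void
todr_lock(void)
{
	assert(!todr_locked);
	todr_locked = true;
	trace_step("L");
}

static void
todr_unlock(void)
{
	assert(todr_locked);
	todr_locked = false;
	trace_step("U");
}

static bool
todr_lock_owned(void)
{
	return todr_locked;
}

/* Save system time to the TOD clock; caller is expected to hold the lock. */
static void
todr_save_systime(void)
{
	assert(todr_lock_owned());
	trace_step("s");
}

/* Set system time from the TOD clock; the argument type is an assumption. */
static void
todr_set_systime(long base)
{
	(void)base;
	assert(todr_lock_owned());
	trace_step("r");
}

/* Placeholder for the machine-dependent sleep/wake step. */
static void
machdep_suspend(void)
{
	trace_step("z");
}

/* The lock is held across the whole save / sleep / restore sequence. */
static void
suspend_resume_cycle(long base)
{
	todr_lock();
	todr_save_systime();
	machdep_suspend();
	todr_set_systime(base);
	todr_unlock();
}
```

Holding one lock across the whole sequence keeps another thread from touching the TOD clock between the save and the restore.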


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
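
The arithmetic behind this fix can be illustrated with a small sketch: boottime is the time of day at initialisation minus the uptime elapsed since the clocks started, and a message's wall-clock time is boottime plus its logged uptime stamp. The helper names ts_sub(), ts_add(), compute_boottime() and message_walltime() are hypothetical; the real kernel and dmesg code is organised differently.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* a - b for normalised timespecs, borrowing a second when needed. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < NSEC_PER_SEC);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += NSEC_PER_SEC;
	}
	return r;
}

/* a + b for normalised timespecs, carrying a second when needed. */
static struct timespec
ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= NSEC_PER_SEC) {
		r.tv_sec++;
		r.tv_nsec -= NSEC_PER_SEC;
	}
	return r;
}

/* Boot time = time of day when it was initialised - uptime at that moment. */
static struct timespec
compute_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}

/* Wall-clock time of a message = boot time + uptime stamp in the message. */
static struct timespec
message_walltime(struct timespec boottime, struct timespec stamp)
{
	return ts_add(boottime, stamp);
}
```

For example, if the time of day is set to 1000.2 s when uptime is already 6.5 s, boottime becomes 993.7 s, and a message stamped at 2.4 s of uptime maps to 996.1 s.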


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
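
The size mismatch behind this bug can be illustrated with a small sketch. It assumes, purely for illustration, that each syscall argument occupies a 64-bit slot regardless of pointer width; union arg_slot, struct fake_execve_args, INIT_ARGV_SLOTS and both helper functions are hypothetical names, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: every syscall argument is padded to a 64-bit slot. */
union arg_slot {
	uint64_t pad;
	const char *ptr;
};

/* Hypothetical stand-in for an execve-style syscall argument structure. */
struct fake_execve_args {
	union arg_slot path;
	union arg_slot argp;
	union arg_slot envp;
};

#define INIT_ARGV_SLOTS 3	/* init's argv[] holds three pointers */

/* The buggy length: size of the syscall-args structure. */
static size_t
argv_bytes_wrong(void)
{
	return sizeof(struct fake_execve_args);
}

/* The intended length: three user-visible pointers. */
static size_t
argv_bytes_right(void)
{
	size_t n = INIT_ARGV_SLOTS * sizeof(char *);

	/* With padded slots the structure is never smaller than argv[]. */
	assert(n <= sizeof(struct fake_execve_args));
	return n;
}
```

On a platform with 64-bit pointers the two lengths happen to coincide, which is how such a bug can go unnoticed; with 32-bit pointers and 64-bit slots the structure is twice as large as the argv[] array.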


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
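
The lazy exithook setup described above can be sketched in userland C. This is
only an illustration under stated assumptions: every name below
(hook_once, establish_exithook, ensure_exithook, fake_sys_semop) is
hypothetical, and a C11 atomic_flag stands in for the kernel's RUN_ONCE
facility. Unlike a full run-once primitive, this guard only ensures the setup
body runs at most once; it does not make concurrent callers wait for the
setup to finish.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical names throughout; this is not the sysv_sem.c code. */
static atomic_flag hook_once = ATOMIC_FLAG_INIT;
static int hooks_established;	/* how many times the setup body ran */

/* Stand-in for establishing the exit hook. */
static void
establish_exithook(void)
{
	assert(hooks_established == 0);	/* must never run twice */
	hooks_established++;
}

/* Run establish_exithook() on the first call only. */
static void
ensure_exithook(void)
{
	/* test_and_set returns the previous value: false exactly once. */
	if (!atomic_flag_test_and_set(&hook_once))
		establish_exithook();
}

/* Stand-in for a sys_sem* entry point: lazy setup, then the real work. */
static int
fake_sys_semop(void)
{
	ensure_exithook();
	return 0;
}
```

Calling fake_sys_semop() any number of times leaves hooks_established at 1,
and a module that is unloaded before any call never establishes the hook.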


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most code in in.c is imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
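
A fallback list like this is typically walked in order until one entry works.
The sketch below assumes that behaviour; first_usable_init and the two probe
functions are hypothetical helpers written for illustration, not code from
init_main.c. Only the list contents are taken from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* List quoted in the commit message above. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Hypothetical helper: return the first path for which probe() reports
 * success (0), or NULL when every candidate fails.
 */
static const char *
first_usable_init(int (*probe)(const char *))
{
	assert(probe != NULL);
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (probe(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Example probe: pretend only the rescue binary exists. */
static int
probe_rescue_only(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Example probe: pretend no candidate exists. */
static int
probe_nothing(const char *path)
{
	(void)path;
	return -1;
}
```

With probe_rescue_only the walk falls through the three /sbin entries and
settles on "/rescue/init", which is the case the new last entry covers.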


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
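
The request path described in this message (look up a secmodel by name, hand
it a command, and get an error back when the secmodel is absent) can be
sketched as a small userland registry. This is a toy model under stated
assumptions, not the kernel implementation: the toy_* names, the fixed-size
table, the uid-pointer argument and the specific error codes are all
hypothetical. Only the identifier "org.netbsd.secmodel.suser" and the
"is-root" command come from the example above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical evaluator signature, loosely modelled on the example call. */
typedef int (*toy_eval_fn)(const char *what, void *arg, void *ret);

struct toy_secmodel {
	const char *id;
	toy_eval_fn eval;
};

#define TOY_MAX_MODELS 4
static struct toy_secmodel toy_models[TOY_MAX_MODELS];
static size_t toy_nmodels;

/* Register an evaluator under a secmodel identifier. */
static int
toy_register(const char *id, toy_eval_fn eval)
{
	assert(id != NULL && eval != NULL);
	if (toy_nmodels == TOY_MAX_MODELS)
		return ENOSPC;
	toy_models[toy_nmodels].id = id;
	toy_models[toy_nmodels].eval = eval;
	toy_nmodels++;
	return 0;
}

/* Forward a request to the named model, or fail if it is not present. */
static int
toy_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < toy_nmodels; i++) {
		if (strcmp(toy_models[i].id, id) == 0)
			return toy_models[i].eval(what, arg, ret);
	}
	return ENOENT;	/* assumed error code for an absent model */
}

/* Toy "suser" evaluator: arg points at a uid, ret at a bool. */
static int
toy_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;	/* assumed error for an unknown command */
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

A caller treats a non-zero return as "no answer available" and only trusts
the result when the call succeeds, which mirrors the `error == 0` check in
the example above.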


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
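
The deferral idea above (queue work during attach, run it once the root file
system is available) can be sketched as a simple callback queue. This is a
minimal userland illustration, not the config_mountroot(9) implementation:
the names defer_until_mountroot, run_mountroot_hooks and fake_load_firmware
are hypothetical, the queue is fixed-size, and callbacks run synchronously
here, which may differ from what the kernel does.

```c
#include <assert.h>
#include <stddef.h>

/* One deferred piece of configuration work. */
struct deferred {
	void (*fn)(void *);
	void *arg;
};

#define MAX_DEFERRED 8
static struct deferred deferred_q[MAX_DEFERRED];
static size_t ndeferred;

/* Queue fn(arg) to run later; returns 0 on success, -1 if the queue is full. */
static int
defer_until_mountroot(void (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	if (ndeferred == MAX_DEFERRED)
		return -1;
	deferred_q[ndeferred].fn = fn;
	deferred_q[ndeferred].arg = arg;
	ndeferred++;
	return 0;
}

/* Called once the root file system is mounted: drain the queue in order. */
static void
run_mountroot_hooks(void)
{
	for (size_t i = 0; i < ndeferred; i++)
		deferred_q[i].fn(deferred_q[i].arg);
	ndeferred = 0;
}

/* Example callback: "load firmware" by bumping a counter. */
static void
fake_load_firmware(void *arg)
{
	(*(int *)arg)++;
}
```

A driver's attach routine would queue its firmware-dependent step with
defer_until_mountroot(), and nothing runs until run_mountroot_hooks() is
called after the root mount.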


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed, with no objections, on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
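
The behaviour described above (lockless fast path, a global lock on the
slow path, and no retry after an error) can be illustrated with a small
userland sketch. This is not the kernel's once implementation: the names
once_sketch, run_once_sketch and demo_init are hypothetical, and C11
atomics plus a pthread mutex stand in for the kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

enum { ONCE_VIRGIN = 0, ONCE_DONE = 1 };

/* Hypothetical once control: completion state plus the saved result. */
struct once_sketch {
	atomic_int state;
	int error;
};

/* One global lock shared by all once controls (assumption for the sketch). */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock taken once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	pthread_mutex_lock(&once_global_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		/* The result is recorded even on failure; there is no retry. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Demo routine that "fails" with 5 and counts its invocations. */
static int demo_calls;

static int
demo_init(void)
{
	demo_calls++;
	return 5;
}
```

Calling run_once_sketch() twice with demo_init runs the routine a single
time and returns the stored error both times. Running the routine while
holding the global lock is a simplification made for the sketch.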


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
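
As a rough illustration of the limit described above, the following
sketch computes the largest vnode count that stays within 75% of both
KVA and physical memory at 2kB per vnode. The helper names and the way
the two limits are combined (taking the smaller one) are assumptions
made for the example, not the code in the tree.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost per vnode, from the commit message (2kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest vnode count whose worst-case cost stays
 * within 75% of the smaller of KVA and physical memory.
 */
static uint64_t
max_desiredvnodes_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying to avoid overflow on large sizes. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical check: 0 if the requested value is allowed, -1 if not. */
static int
check_desiredvnodes_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= max_desiredvnodes_sketch(kva_bytes, phys_bytes)
	    ? 0 : -1;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the smaller limit is
KVA: 75% of 1 GiB is 805306368 bytes, which at 2048 bytes per vnode
gives a cap of 393216 vnodes.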


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
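
The reference-counting rule described above (a lookup holds a
reference, and detaching an in-use discipline fails with EBUSY) can be
sketched as follows. The structure and the *_sketch function names are
hypothetical stand-ins, not the ttyldisc_*() API itself, and the
locking a real implementation needs is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered line discipline. */
struct ldisc_sketch {
	const char *name;
	unsigned int refcnt;	/* references held by lookups */
	int attached;		/* nonzero while registered */
};

/* A lookup takes a reference on an attached discipline. */
static struct ldisc_sketch *
ldisc_lookup_sketch(struct ldisc_sketch *d)
{
	if (!d->attached)
		return NULL;
	d->refcnt++;
	return d;
}

/* Release drops the reference taken by a lookup. */
static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Detach refuses with EBUSY while any reference is outstanding. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	if (d->refcnt > 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

In this sketch a detach attempted between a lookup and its matching
release returns EBUSY, and succeeds once the reference is dropped.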


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks
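
The bounds-checked key index to node mapping table described above might look roughly like the following. This is a sketch only: `struct keymap`, `keymap_lookup`, `keymap_set` and `KEYIX_NONE` are hypothetical stand-ins, not the net80211 API, and the hash-lookup fallback is left out.

```c
#include <assert.h>
#include <stddef.h>

#define KEYIX_NONE ((unsigned)-1)	/* hypothetical "no h/w key" value */

struct node {
	int id;
};

struct keymap {
	struct node **slot;	/* caller-supplied array of `max` entries */
	unsigned max;		/* one past the largest valid key index */
};

/* Return the node mapped to keyix, or NULL if unmapped or out of range. */
static struct node *
keymap_lookup(const struct keymap *km, unsigned keyix)
{
	assert(km != NULL && km->slot != NULL);
	if (keyix == KEYIX_NONE || keyix >= km->max)
		return NULL;
	return km->slot[keyix];
}

/* Record a mapping; returns -1 if the index cannot be stored. */
static int
keymap_set(struct keymap *km, unsigned keyix, struct node *n)
{
	assert(km != NULL && km->slot != NULL);
	if (keyix == KEYIX_NONE || keyix >= km->max)
		return -1;
	km->slot[keyix] = n;
	return 0;
}
```

A receive path would try `keymap_lookup()` first and fall back to a slower lookup when it returns NULL.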

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)
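
Collecting one 16-byte cipher block from a chain split at arbitrary points can be sketched as below. `struct seg` and `chain_gather` are hypothetical stand-ins for an mbuf chain and its accessor, not the actual net80211 code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an mbuf: one segment of a chained buffer. */
struct seg {
	struct seg *next;
	const unsigned char *data;
	size_t len;
};

/*
 * Copy up to `want` bytes, starting `off` bytes into the chain headed
 * by `s`, following the chain across as many segments as needed.
 * Returns the number of bytes copied, which is less than `want` only
 * if the chain ends early.
 */
static size_t
chain_gather(const struct seg *s, size_t off, unsigned char *dst, size_t want)
{
	size_t got = 0;

	assert(dst != NULL);
	while (s != NULL && got < want) {
		if (off >= s->len) {
			/* The starting offset lies beyond this segment. */
			off -= s->len;
		} else {
			size_t n = s->len - off;

			if (n > want - got)
				n = want - got;
			memcpy(dst + got, s->data + off, n);
			got += n;
			off = 0;
		}
		s = s->next;
	}
	return got;
}
```

A caller that needs a full block treats a return value below 16 as a malformed chain instead of assuming the data fits in two segments.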

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
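
The substitution itself can be sketched in userland C. This is an illustration, not the kernel implementation: `struct magic` and `expand_magic` are hypothetical, and the rule that a magic name must be followed by '/' or the end of the string is an assumption made here so that "@machine" does not match the front of "@machine_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table of magic names and the values they expand to. */
struct magic {
	const char *name;	/* without the leading '@' */
	const char *value;
};

/*
 * Copy `in` to `out`, replacing each "@name" that matches a table
 * entry with its value.  Returns 0 on success, or -1 if the result
 * does not fit in `outlen` bytes including the terminating NUL.
 */
static int
expand_magic(const char *in, char *out, size_t outlen,
    const struct magic *tab, size_t ntab)
{
	size_t o = 0;

	assert(in != NULL && out != NULL && outlen > 0);
	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 1, i;

		if (*in == '@') {
			for (i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (strncmp(in + 1, tab[i].name, n) == 0 &&
				    (in[1 + n] == '/' || in[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			if (o + n >= outlen)
				return -1;
			memcpy(out + o, rep, n);
			o += n;
		} else {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in;
		}
		in += skip;
	}
	out[o] = '\0';
	return 0;
}
```

With a table mapping `machine_arch` to `sparc`, the link target `/arch/@machine_arch/bin` expands to `/arch/sparc/bin`.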


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
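
A marker-skipping iteration macro of this kind can be sketched as follows. `struct entry`, `skip_markers` and `ENTRY_FOREACH` are hypothetical; the real macro works on struct proc and the <sys/queue.h> list types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry; `is_marker` tags placeholder entries. */
struct entry {
	struct entry *next;
	int is_marker;
	int pid;
};

/* Advance to the next entry that is not a marker (or NULL). */
static struct entry *
skip_markers(struct entry *e)
{
	while (e != NULL && e->is_marker)
		e = e->next;
	return e;
}

/* Iterate over real entries only, in the style of LIST_FOREACH. */
#define ENTRY_FOREACH(var, head)				\
	for ((var) = skip_markers(head); (var) != NULL;		\
	    (var) = skip_markers((var)->next))

/* Sum the pids of all real entries. */
static int
sum_pids(struct entry *head)
{
	struct entry *e;
	int sum = 0;

	ENTRY_FOREACH(e, head) {
		assert(!e->is_marker);	/* markers are never visited */
		sum += e->pid;
	}
	return sum;
}
```

A marker lets an iterator that must drop the list lock remember its position, and ordinary walkers written with the macro never see it.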


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
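
The sizing arithmetic can be made concrete with a small sketch. `vnode_cache_size` and its parameters are hypothetical, and the 250-byte vnode size in the usage note is an arbitrary figure for illustration, not the real size of struct vnode.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sizing helper: give `tenths_pct` tenths of a percent of
 * physical memory to vnodes (10 means 1%, 5 means 0.5%), but never
 * fewer than `min_vnodes` entries.
 */
static uint64_t
vnode_cache_size(uint64_t physmem_bytes, unsigned tenths_pct,
    uint64_t vnode_bytes, uint64_t min_vnodes)
{
	uint64_t budget, n;

	assert(vnode_bytes > 0);
	budget = physmem_bytes / 1000 * tenths_pct;
	n = budget / vnode_bytes;
	return n < min_vnodes ? min_vnodes : n;
}
```

For 1,000,000,000 bytes of memory and an assumed 250-byte vnode, moving from 0.5% to 1% doubles the default from 20000 to 40000 vnodes.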


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
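
The "invoke until nobody did any work" loop can be sketched like this. `struct finalizer`, `run_finalizers` and `drain_one` are hypothetical names used only to show the control flow, not the autoconfiguration API.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_t)(void *arg);

struct finalizer {
	finalizer_t fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made, including the final idle pass.
 */
static int
run_finalizers(const struct finalizer *tab, size_t n)
{
	int passes = 0, worked;
	size_t i;

	assert(tab != NULL || n == 0);
	do {
		worked = 0;
		for (i = 0; i < n; i++)
			worked |= (tab[i].fn(tab[i].arg) != 0);
		passes++;
	} while (worked);
	return passes;
}

/* Example finalizer: "configures" one pending unit per call. */
static int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Iterating to a fixed point matters because one finalizer's work, such as assembling a RAID set, can expose new devices for another finalizer to handle.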


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
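
Rounding an address up to a power-of-two alignment is usually done with a mask, as in the sketch below. `is_pow2` and `align_up` are hypothetical helpers and say nothing about how uvm_map() itself applies its align argument.

```c
#include <assert.h>
#include <stdint.h>

/* True if x is a nonzero power of two. */
static int
is_pow2(uintptr_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(is_pow2(align));
	return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works for powers of two, which is one reason to restrict an alignment argument to them.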


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
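
The two rules above amount to clamping the soft limit to the system-wide maximum. A minimal sketch in C, assuming a simplified stand-in for struct rlimit; the type and function names are hypothetical and are not the kernel's:

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rlimit (soft and hard limit). */
struct limit_sketch {
	uint64_t cur;	/* soft limit, like rlim_cur */
	uint64_t max;	/* hard limit, like rlim_max */
};

/*
 * Hard limit: the system-wide maximum (maxfiles or maxproc) rather
 * than "infinity". Soft limit: the compile-time default (NOFILE or
 * MAXUPRC), clamped so it never exceeds the hard limit.
 */
static struct limit_sketch
init_limit(uint64_t compile_default, uint64_t system_max)
{
	struct limit_sketch l;

	l.max = system_max;
	l.cur = compile_default < system_max ? compile_default : system_max;
	return l;
}
```

A program that sizes a table from the hard limit then gets a bounded value instead of RLIM_INFINITY.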


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
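
The detail that is easy to miss in this mapping is the argument order: bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. A small sketch in C that spells this out with standard library calls; the compat_* wrapper names are hypothetical and exist only for illustration:

```c
#include <string.h>

/* Hypothetical wrappers, for illustration only. */

/* bcopy(src, dst, len) becomes memcpy(dst, src, len). */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);	/* caller guarantees no overlap */
}

/* ovbcopy(src, dst, len) becomes memmove(dst, src, len). */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);	/* safe for overlapping regions */
}

/* bcmp(a, b, len) becomes memcmp(a, b, len); zero means equal. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}

/* bzero(buf, len) becomes memset(buf, 0, len). */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```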


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.524 30-Apr-2020 riastradh

Rewrite entropy subsystem.

Primary goals:

1. Use cryptography primitives designed and vetted by cryptographers.
2. Be honest about entropy estimation.
3. Propagate full entropy as soon as possible.
4. Simplify the APIs.
5. Reduce overhead of rnd_add_data and cprng_strong.
6. Reduce side channels of HWRNG data and human input sources.
7. Improve visibility of operation with sysctl and event counters.

Caveat: rngtest is no longer used generically for RND_TYPE_RNG
rndsources. Hardware RNG devices should have hardware-specific
health tests. For example, checking for two repeated 256-bit outputs
works to detect AMD's 2019 RDRAND bug. Not all hardware RNGs are
necessarily designed to produce exactly uniform output.

ENTROPY POOL

- A Keccak sponge, with test vectors, replaces the old LFSR/SHA-1
kludge as the cryptographic primitive.

- `Entropy depletion' is available for testing purposes with a sysctl
knob kern.entropy.depletion; otherwise it is disabled, and once the
system reaches full entropy it is assumed to stay there as far as
modern cryptography is concerned.

- No `entropy estimation' based on sample values. Such `entropy
estimation' is a contradiction in terms, dishonest to users, and a
potential source of side channels. It is the responsibility of the
driver author to study the entropy of the process that generates
the samples.

- Per-CPU gathering pools avoid contention on a global queue.

- Entropy is occasionally consolidated into global pool -- as soon as
it's ready, if we've never reached full entropy, and with a rate
limit afterward. Operators can force consolidation now by running
sysctl -w kern.entropy.consolidate=1.

- rndsink(9) API has been replaced by an epoch counter which changes
whenever entropy is consolidated into the global pool.
. Usage: Cache entropy_epoch() when you seed. If entropy_epoch()
has changed when you're about to use whatever you seeded, reseed.
. Epoch is never zero, so initialize cache to 0 if you want to reseed
on first use.
. Epoch is -1 iff we have never reached full entropy -- in other
words, the old rnd_initial_entropy is (entropy_epoch() != -1) --
but it is better if you check for changes rather than for -1, so
that if the system estimated its own entropy incorrectly, entropy
consolidation has the opportunity to prevent future compromise.

- Sysctls and event counters provide operator visibility into what's
happening:
. kern.entropy.needed - bits of entropy short of full entropy
. kern.entropy.pending - bits known to be pending in per-CPU pools,
can be consolidated with sysctl -w kern.entropy.consolidate=1
. kern.entropy.epoch - number of times consolidation has happened,
never 0, and -1 iff we have never reached full entropy
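
The epoch usage rule described above (cache the epoch when seeding, reseed when it has changed, start the cache at 0 to force a reseed on first use) can be sketched as follows. This is a userland illustration under stated assumptions: entropy_epoch_stub() and fake_epoch stand in for the kernel's entropy_epoch(), and struct seeded_thing and its functions are hypothetical consumers, not code from this commit.

```c
/*
 * Stand-in for the kernel's entropy_epoch(). In this sketch it is a
 * plain variable; (unsigned)-1 models "never reached full entropy".
 */
static unsigned fake_epoch = (unsigned)-1;

static unsigned
entropy_epoch_stub(void)
{
	return fake_epoch;
}

/* Hypothetical consumer that must reseed whenever the epoch changes. */
struct seeded_thing {
	unsigned epoch;		/* epoch cached at last seed; 0 = never */
	unsigned reseeds;	/* how many times we have reseeded */
};

static void
reseed(struct seeded_thing *s)
{
	s->epoch = entropy_epoch_stub();	/* cache the epoch */
	/* ... draw fresh seed material here ... */
	s->reseeds++;
}

static void
use_seeded(struct seeded_thing *s)
{
	/* Compare for change, not for -1, as the message recommends. */
	if (s->epoch != entropy_epoch_stub())
		reseed(s);
	/* ... produce output from the seeded state ... */
}
```

Because the real epoch is never zero, initializing the cached value to 0 guarantees the first use reseeds, and every later consolidation triggers exactly one more reseed per consumer.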

CPRNG_STRONG

- A cprng_strong instance is now a collection of per-CPU NIST
Hash_DRBGs. There are only two in the system: user_cprng for
/dev/urandom and sysctl kern.?random, and kern_cprng for kernel
users which may need to operate in interrupt context up to IPL_VM.

(Calling cprng_strong in interrupt context does not strike me as a
particularly good idea, so I added an event counter to see whether
anything actually does.)

- Event counters provide operator visibility into when reseeding
happens.

INTEL RDRAND/RDSEED, VIA C3 RNG (CPU_RNG)

- Unwired for now; will be rewired in a subsequent commit.


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
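
The intended suspend/resume calling pattern above can be sketched as a small single-threaded model. This is only an illustration under stated assumptions: every `sketch_` name is hypothetical, the "lock" is a plain flag rather than a kernel mutex, and the clock is a bare integer. Unlike the committed code (see the NOTE above), this model does assert that the lock is held in the save path.

```c
#include <assert.h>
#include <stdbool.h>

/* Single-threaded stand-in for the TODR mutex; NOT a real lock. */
static bool sketch_todr_locked;
static long sketch_todr_value;	/* pretend TOD clock register */
static long sketch_systime;	/* pretend system time */

static void
sketch_todr_lock(void)
{
	assert(!sketch_todr_locked);
	sketch_todr_locked = true;
}

static void
sketch_todr_unlock(void)
{
	assert(sketch_todr_locked);
	sketch_todr_locked = false;
}

static bool
sketch_todr_lock_owned(void)
{
	return sketch_todr_locked;
}

/* Write the system time to the TOD clock; caller holds the lock. */
static void
sketch_todr_save_systime(void)
{
	assert(sketch_todr_lock_owned());
	sketch_todr_value = sketch_systime;
}

/* Load the system time from the TOD clock, falling back to 'base'. */
static void
sketch_todr_set_systime(long base)
{
	assert(sketch_todr_lock_owned());
	sketch_systime = (sketch_todr_value != 0) ? sketch_todr_value : base;
}

/* Historically named wrapper: take the lock around the real work. */
static void
sketch_inittodr(long base)
{
	sketch_todr_lock();
	sketch_todr_set_systime(base);
	sketch_todr_unlock();
}

/* The suspend/resume pattern: one lock hold spans save, sleep and set. */
static void
sketch_suspend_resume(long elapsed)
{
	sketch_todr_lock();
	sketch_todr_save_systime();
	sketch_todr_value += elapsed;	/* the TOD clock ticks while asleep */
	sketch_todr_set_systime(0);
	sketch_todr_unlock();
}
```

Holding the lock across the whole sequence is what keeps another caller from saving or setting the time between the two halves.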


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
while the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
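
The subtraction described above amounts to ordinary timespec arithmetic. The following is a minimal sketch, not the committed kernel code: `ts_sub()` and `boottime_from_init()` are hypothetical helper names, and the inputs are assumed to be the time of day at the moment it was initialised and the uptime measured at that same moment.

```c
#include <assert.h>
#include <time.h>

/* Subtract b from a, normalising tv_nsec into [0, 1e9). */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Hypothetical helper: the boot time is the time of day at
 * initialisation minus the interval since the clocks were started.
 */
static struct timespec
boottime_from_init(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With an uptime of about six seconds at initialisation, the result lands six seconds before the time of day that was read, which is the offset the commit removes.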

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
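
The size mismatch described above can be illustrated outside the kernel. This is a sketch under assumptions: `union sketch_slot` and `struct sketch_execve_args` are hypothetical stand-ins for a register-padded syscall argument structure, not the real NetBSD definitions. The point is only that the byte count for a three-entry argv is derived from the pointer size, while the size of the argument structure is derived from the padded slot size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument slot, padded to a 64-bit register width. */
union sketch_slot {
	unsigned long long pad;
	const char *ptr;
};

/* Hypothetical stand-in for a syscall argument structure. */
struct sketch_execve_args {
	union sketch_slot path;
	union sketch_slot argp;
	union sketch_slot envp;
};

#define SKETCH_INIT_ARGC 3	/* three argv slots, as in the commit message */

/* Bytes that an argv[] of three pointers actually occupies. */
static size_t
sketch_argv_bytes(void)
{
	return SKETCH_INIT_ARGC * sizeof(char *);
}

/* Bytes that sizeof() on the argument structure would report. */
static size_t
sketch_args_bytes(void)
{
	return sizeof(struct sketch_execve_args);
}
```

On an ABI where pointers are narrower than the padded slot, the two sizes differ, so using the structure size would copy more bytes than the argv array holds.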


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).
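
The deferral described above can be pictured with a small user-space
sketch. This is not the config_mountroot(9) implementation; the names
defer_until_mountroot(), mountroot_done() and finish_attach(), the fixed
queue size, and the choice to run a callback immediately once root is
already mounted are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8		/* arbitrary bound for this sketch */

typedef void (*deferred_fn)(void *);

struct deferred {
	deferred_fn fn;
	void *arg;
};

static struct deferred deferred_queue[MAX_DEFERRED];
static size_t deferred_count;
static int root_mounted;

/* Hypothetical: queue fn(arg) to run once the root file system is mounted. */
int
defer_until_mountroot(deferred_fn fn, void *arg)
{
	assert(fn != NULL);
	if (root_mounted) {
		/* Assumption: nothing left to wait for, so run right away. */
		fn(arg);
		return 0;
	}
	if (deferred_count == MAX_DEFERRED)
		return -1;	/* queue full */
	deferred_queue[deferred_count].fn = fn;
	deferred_queue[deferred_count].arg = arg;
	deferred_count++;
	return 0;
}

/* Hypothetical: called once, after the root file system becomes available. */
void
mountroot_done(void)
{
	root_mounted = 1;
	for (size_t i = 0; i < deferred_count; i++)
		deferred_queue[i].fn(deferred_queue[i].arg);
	deferred_count = 0;
}

/* Example callback: stands in for "load firmware, then finish attach". */
void
finish_attach(void *arg)
{
	*(int *)arg = 1;
}
```

A driver would register finish_attach() from its attach routine and only
touch firmware files inside the callback, after mountroot_done() has run.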

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in human-readable form.
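
Why the buffer length matters can be shown with a rough sketch: if the
formatter scales the value until the text fits the buffer, a shorter
buffer forces a larger unit. humanize_sketch() below is a hypothetical
stand-in written for this note; it is not format_bytes() and its exact
output format is an assumption.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper: print "bytes" into buf, dividing by 1024 and
 * moving to the next unit prefix until the result fits in len bytes.
 * Returns the string length, or -1 if nothing fits.
 */
int
humanize_sketch(char *buf, size_t len, uint64_t bytes)
{
	static const char *const prefix[] = { "", "K", "M", "G", "T", "P", "E" };
	const size_t nprefix = sizeof(prefix) / sizeof(prefix[0]);

	assert(buf != NULL && len > 0);
	for (size_t i = 0; i < nprefix; i++) {
		int n = snprintf(buf, len, "%" PRIu64 " %sB", bytes, prefix[i]);
		if (n >= 0 && (size_t)n < len)
			return n;	/* fits, including the terminating NUL */
		bytes /= 1024;		/* too long: try the next larger unit */
	}
	return -1;
}
```

With a 9-byte buffer, 2 GiB cannot be printed in bytes or kilobytes, so
the sketch settles on megabytes; with a 16-byte buffer the raw byte count
fits and no scaling happens.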


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
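
The prompt handling described above amounts to classifying one line of
input. A minimal sketch follows; the enum, the function name
classify_init_input(), and the rule that a valid path starts with '/'
are assumptions for illustration, not the code in init_main.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum init_input {
	INIT_INPUT_PATH,	/* looks like a path to an init program */
	INIT_INPUT_HALT,
	INIT_INPUT_REBOOT,
	INIT_INPUT_DDB,
	INIT_INPUT_INVALID	/* caller would print a usage line */
};

/* Hypothetical classifier for a line typed at the init path prompt. */
enum init_input
classify_init_input(const char *input)
{
	assert(input != NULL);
	if (strcmp(input, "halt") == 0)
		return INIT_INPUT_HALT;
	if (strcmp(input, "reboot") == 0)
		return INIT_INPUT_REBOOT;
	if (strcmp(input, "ddb") == 0)
		return INIT_INPUT_DDB;
	/* Assumption: anything starting with '/' is tried as a path. */
	if (input[0] == '/')
		return INIT_INPUT_PATH;
	return INIT_INPUT_INVALID;
}
```

The special strings are checked before the path test so that they can
never be mistaken for a relative file name.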


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
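
The shape of such a "once" primitive can be sketched in portable C11.
This is not the kernel code: struct once_sketch, once_sketch_run() and
demo_fail() are hypothetical names, a spin flag stands in for the global
lock, and the routine is run with that lock held, which is a
simplification.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ONCE_SK_INIT = 0, ONCE_SK_DONE = 1 };

struct once_sketch {
	atomic_int state;	/* ONCE_SK_INIT until the routine has run */
	int error;		/* result of the single run, kept forever */
};

/* One global lock shared by all once_sketch objects (stand-in spin flag). */
static atomic_flag once_sk_lock = ATOMIC_FLAG_INIT;

int
once_sketch_run(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: lockless check once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_SK_DONE)
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_sk_lock,
	    memory_order_acquire))
		;	/* spin until the global lock is free */

	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_SK_DONE) {
		/* Run exactly once; an error is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_SK_DONE,
		    memory_order_release);
	}
	atomic_flag_clear_explicit(&once_sk_lock, memory_order_release);
	return o->error;
}

/* Demo routine that always fails, to show the no-retry behaviour. */
static int demo_calls;

int
demo_fail(void)
{
	demo_calls++;
	return 5;
}
```

Calling once_sketch_run() twice with demo_fail() runs the routine a
single time and hands back the same error on both calls, matching the
"does not retry after an error" behaviour described above.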


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
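
The arithmetic behind that cap can be sketched as follows. The function
names and the way the two limits are combined (take the smaller of KVA
and physical memory) are assumptions for illustration; only the 2kB
figure and the 75% bound come from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode and its associated data, per the text. */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: the largest vnode count whose worst-case
 * footprint stays within 75% of both KVA and physical memory.
 */
uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes, uint64_t cost_bytes)
{
	uint64_t smaller;

	assert(cost_bytes > 0);
	smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;
	/* Divide before multiplying so large sizes cannot overflow. */
	return (smaller / 4 * 3) / cost_bytes;
}

/* Hypothetical validation hook: 0 if the request is allowed, -1 if not. */
int
check_desiredvnodes(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	uint64_t limit = vnode_limit(kva_bytes, phys_bytes, VNODE_COST_BYTES);

	return requested <= limit ? 0 : -1;
}
```

For example, with 1 GiB of KVA and 4 GiB of RAM the KVA is the tighter
bound: 75% of 1 GiB divided by 2kB allows 393216 vnodes.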


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
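
The reference-counting rule described above (a lookup takes a reference, a release drops it, and detaching an in-use discipline fails with EBUSY) can be sketched in user-space C. This is an illustration only, not the tty_conf.c code: `struct ldisc` and the `ldisc_*` functions are hypothetical names, and the locking a kernel implementation needs is omitted.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a line discipline entry. */
struct ldisc {
	const char *name;	/* e.g. "termios" */
	unsigned refcnt;	/* references held by lookups */
	int attached;		/* nonzero while registered */
};

/* A lookup returns the entry with one more reference held. */
struct ldisc *
ldisc_lookup(struct ldisc *d)
{
	if (d == NULL || !d->attached)
		return NULL;
	d->refcnt++;
	return d;
}

/* Drop a reference taken by ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	if (d->refcnt > 0)
		d->refcnt--;
}

/* Detaching fails with EBUSY while any reference is outstanding. */
int
ldisc_detach(struct ldisc *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

A caller that looks a discipline up and forgets to release it keeps the provider from detaching, which is the failure mode the EBUSY return makes visible.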


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
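
The "once control" idea used in this series of commits (initialize a subsystem lazily on first use instead of calling its init routine from main()) can be sketched as follows. This is a single-threaded simplification with hypothetical names (`struct once_ctl`, `run_once`, `fake_lkm_init`, `fake_lkm_open`); a real once control also serializes concurrent callers, which this sketch does not attempt.

```c
#include <stdbool.h>

/* Hypothetical once control: remembers whether init ran and its result. */
struct once_ctl {
	bool done;
	int status;
};

/*
 * Run init() the first time only; later calls return the saved status.
 * Not thread-safe: a real once control takes a lock around this.
 */
int
run_once(struct once_ctl *o, int (*init)(void))
{
	if (!o->done) {
		o->status = init();
		o->done = true;
	}
	return o->status;
}

/* Stand-in for a subsystem init routine; counts how often it runs. */
int lkm_init_calls;

int
fake_lkm_init(void)
{
	lkm_init_calls++;
	return 0;
}

struct once_ctl lkm_once;

/* Stand-in for an open routine that triggers initialization on demand. */
int
fake_lkm_open(void)
{
	return run_once(&lkm_once, fake_lkm_init);
}
```

With this shape, a kernel that never opens the device never pays for the initialization, and main() no longer needs to know the subsystem exists.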


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
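
As a rough illustration of the expansion described above, the following user-space sketch substitutes magic names in a symlink target. It is not the kernel implementation: `struct magic_var` and `magic_expand` are hypothetical, the values are supplied by the caller, and the rule that a name only matches when followed by '/' or the end of the string is an assumption made here so that @machine does not swallow the prefix of @machine_arch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name and the value it expands to. */
struct magic_var {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "i386" */
};

/*
 * Copy src to dst, replacing each magic name with its value.
 * Assumption: a name matches only if followed by '/' or end of string.
 * Returns 0 on success, -1 if the result does not fit in dstlen bytes.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const char *repl = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (strncmp(src, vars[i].name, n) == 0 &&
				    (src[n] == '/' || src[n] == '\0')) {
					repl = vars[i].value;
					skip = n;
					break;
				}
			}
		}

		if (repl != NULL) {
			size_t rlen = strlen(repl);

			/* Keep one byte free for the terminating NUL. */
			if (out + rlen >= dstlen)
				return -1;
			memcpy(dst + out, repl, rlen);
			out += rlen;
			src += skip;
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With a table mapping @machine_arch to "sparc", the link target /arch/@machine_arch/bin from the example usage would resolve to /arch/sparc/bin.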


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
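
The marker-skipping iteration can be sketched on a plain singly linked list. The names here (`struct entry`, `ENTRY_FOREACH`, `entry_foreach_call`, `count_entry`) are hypothetical and only mirror the idea; the sketch does not insert its own marker or drop any lock, which is what the real proclist_foreach_call uses markers for.

```c
#include <stddef.h>

/* Hypothetical list entry; markers are placeholders, not real processes. */
struct entry {
	struct entry *next;
	int is_marker;
	int pid;
};

/* Like a plain list walk, but the loop body never sees marker entries. */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Call fn on every non-marker entry; stop early on a nonzero return. */
int
entry_foreach_call(struct entry *head,
    int (*fn)(struct entry *, void *), void *arg)
{
	struct entry *e;
	int ret = 0;

	ENTRY_FOREACH(e, head) {
		ret = fn(e, arg);
		if (ret != 0)
			break;
	}
	return ret;
}

/* Example callback: count the entries visited. */
int
count_entry(struct entry *e, void *arg)
{
	(void)e;
	(*(int *)arg)++;
	return 0;
}
```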


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
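
To make the sizing arithmetic concrete, here is a small sketch of how a default vnode count could be derived from a fraction of physical memory. The function name `default_vnodes`, the per-mille parameter, and any vnode size passed in are assumptions for illustration; the commit does not show the actual formula used in init_main.c.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit in permille/1000 of physical memory.
 * 5 corresponds to the old 0.5% default, 10 to the new 1% default.
 */
uint64_t
default_vnodes(uint64_t physmem_bytes, unsigned permille, uint64_t vnode_size)
{
	if (vnode_size == 0)
		return 0;
	/* Divide first so the multiplication cannot overflow. */
	return (physmem_bytes / 1000 * permille) / vnode_size;
}
```

Because the count is inversely proportional to the structure size, a port with a larger struct vnode gets fewer cache entries from the same percentage, which is the concern raised above for the 64-bit ports.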


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.
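
The iteration described above can be sketched roughly as follows. This is a
minimal illustration, not the kernel's actual code: finalizer_fn,
run_finalizers(), and example_finalizer() are hypothetical names, and the
convention that a finalizer returns nonzero when it did work is an assumption.

```c
#include <stddef.h>

/* Hypothetical finalizer type: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void);

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports having done work.  Returns the number
 * of passes made, including the final idle pass.
 */
static int
run_finalizers(finalizer_fn *fns, size_t nfns)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < nfns; i++) {
			if (fns[i]())
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: has work to do on its first two calls only. */
static int pending = 2;

static int
example_finalizer(void)
{
	if (pending > 0) {
		pending--;
		return 1;
	}
	return 0;
}
```

Looping until a pass does no work lets one finalizer react to devices that
another finalizer made visible during an earlier pass.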

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.
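
As an illustration of the kind of change this describes, the sketch below
walks the same list twice, once through the queue members and once with
LIST_FOREACH. struct item and both function names are hypothetical; only
the <sys/queue.h> macros and member names are taken from the real header.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical list element; the names are for illustration only. */
struct item {
	int value;
	LIST_ENTRY(item) link;
};

LIST_HEAD(item_list, item);

/* Before: walk the list by touching the queue members directly. */
static int
sum_direct(struct item_list *head)
{
	int sum = 0;
	struct item *it;

	for (it = head->lh_first; it != NULL; it = it->link.le_next)
		sum += it->value;
	return sum;
}

/* After: the same traversal expressed with the queue macros. */
static int
sum_macro(struct item_list *head)
{
	int sum = 0;
	struct item *it;

	LIST_FOREACH(it, head, link)
		sum += it->value;
	return sum;
}
```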


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
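
For illustration only, the mapping can be written as a lookup function. The
OLD_KERN_* enumerators are hypothetical stand-ins with arbitrary values, not
the original Mach constants; KERN_FAILURE and the unused codes are left out
because the commit gives no single errno for them, and the EINVAL fallback
is an assumption.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code into the errno value listed above. */
static int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Assumption: anything outside the table is treated as EINVAL. */
	return EINVAL;
}
```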


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
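
A sketch of the problem being avoided, under the assumption that the byte
count is derived from a page count and a page size. limit_t is a
hypothetical 64-bit stand-in for rlim_t and avail_bytes() is an invented
helper, not a kernel function.

```c
#include <stdint.h>

/* Hypothetical 64-bit stand-in for rlim_t. */
typedef uint64_t limit_t;

/*
 * Convert a page count to a byte count.  Widening one operand before
 * the multiplication makes the product a 64-bit unsigned value, so it
 * neither wraps at 4 GB nor picks up a sign bit along the way.
 */
static limit_t
avail_bytes(unsigned int npages, unsigned int page_size)
{
	return (limit_t)npages * page_size;
}
```

With 1048576 pages of 4096 bytes the result is 4294967296, which a 32-bit
intermediate could not represent.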


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
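
The sizing rule can be sketched like this. auto_vnode_count() and its
parameters are hypothetical; in particular the per-vnode size is passed in
as an assumption, since the commit message does not state how the kernel
accounts for it.

```c
#include <stdint.h>

/*
 * Size the vnode cache from physical memory: budget 0.5% of memory
 * for vnodes, but never go below the configured minimum.
 */
static uint64_t
auto_vnode_count(uint64_t mem_bytes, uint64_t vnode_bytes,
    uint64_t min_vnodes)
{
	uint64_t budget = mem_bytes / 200;	/* 0.5% of memory */
	uint64_t n = budget / vnode_bytes;

	return n < min_vnodes ? min_vnodes : n;
}
```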


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.
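
A rough sketch of how a (low address, size) pair might be turned into an
initial stack pointer. initial_stack_pointer() is a hypothetical helper;
real machine-dependent code would also handle alignment and any frame it
needs to reserve, which this ignores.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the starting stack pointer for a stack area given as its lowest
 * address and its size.  A downward-growing stack starts at the top of
 * the area; an upward-growing one starts at the bottom.
 */
static uintptr_t
initial_stack_pointer(uintptr_t low_addr, size_t size, int grows_up)
{
	return grows_up ? low_addr : low_addr + size;
}
```

Describing the area this way keeps the caller independent of the stack
direction on a given platform.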

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
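
The two rules above amount to the following. struct limit_pair and
init_limit() are hypothetical names used only for this sketch; the real
code fills in struct rlimit entries in the initial process limits.

```c
/* Hypothetical stand-in for one soft/hard resource limit pair. */
struct limit_pair {
	unsigned long cur;	/* soft limit */
	unsigned long max;	/* hard limit */
};

/*
 * The hard limit is the system-wide maximum (maxfiles or maxproc);
 * the soft limit is the compiled-in default (NOFILE or MAXUPRC),
 * capped so it never exceeds the hard limit.
 */
static struct limit_pair
init_limit(unsigned long compiled_default, unsigned long system_max)
{
	struct limit_pair lim;

	lim.max = system_max;
	lim.cur = compiled_default < system_max ?
	    compiled_default : system_max;
	return lim;
}
```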


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
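
The point to watch in this conversion is that bcopy() and ovbcopy() take
the source first while memcpy() and memmove() take the destination first.
The wrappers below are hypothetical and exist only to show the argument
order side by side.

```c
#include <string.h>

/* was: bcopy(src, dst, len) */
static void
copy_bytes(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* was: ovbcopy(src, dst, len); the regions may overlap */
static void
move_bytes(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* was: bcmp(a, b, len); argument order is unchanged */
static int
compare_bytes(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* was: bzero(buf, len); memset() gains an explicit fill value */
static void
zero_bytes(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```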


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.
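
The alignment arithmetic behind this change can be sketched as follows. This is a minimal user-space illustration, not the kernel's macros: `SKETCH_ALIGNBYTES`, `sketch_align_up` and `sketch_align_down` are hypothetical names, and the alignment is assumed to be `sizeof(long)`. Rounding up adds the mask before clearing the low bits; rounding down (as needed when working from the top of a stack) only clears them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ALIGNBYTES-style mask: alignment minus one. */
#define SKETCH_ALIGNBYTES ((uintptr_t)(sizeof(long) - 1))

/* Round an address up to the next aligned boundary. */
uintptr_t
sketch_align_up(uintptr_t addr)
{
	/* The mask trick only works for power-of-two alignments. */
	assert((SKETCH_ALIGNBYTES & (SKETCH_ALIGNBYTES + 1)) == 0);
	return (addr + SKETCH_ALIGNBYTES) & ~SKETCH_ALIGNBYTES;
}

/* Round an address down to the previous aligned boundary. */
uintptr_t
sketch_align_down(uintptr_t addr)
{
	assert((SKETCH_ALIGNBYTES & (SKETCH_ALIGNBYTES + 1)) == 0);
	return addr & ~SKETCH_ALIGNBYTES;
}
```

An address that is already aligned is returned unchanged by both helpers.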


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.523 26-Apr-2020 thorpej

Add a NetBSD native futex implementation, mostly written by riastradh@.
Map the COMPAT_LINUX futex calls to the native ones.


Revision tags: bouyer-xenpvh-base2 phil-wifi-20200421 bouyer-xenpvh-base1 phil-wifi-20200411 bouyer-xenpvh-base is-mlppp-base phil-wifi-20200406 ad-namecache-base3
# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
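
The lock/save/sleep/restore pattern described above can be sketched in user space. Everything here is a hypothetical model: `struct sketch_todr` and the `sketch_*` functions are illustration names, the "mutex" is a plain flag suitable only for a single-threaded demonstration, and the sketch ignores the shutdown caveat in the NOTE.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the TODR state; not kernel code. */
struct sketch_todr {
	bool locked;      /* models the TODR mutex being held */
	long rtc_seconds; /* models the battery-backed clock */
	long sys_seconds; /* models the running system time */
};

void
sketch_todr_lock(struct sketch_todr *t)
{
	assert(!t->locked);	/* no recursion in this model */
	t->locked = true;
}

void
sketch_todr_unlock(struct sketch_todr *t)
{
	assert(t->locked);
	t->locked = false;
}

bool
sketch_todr_lock_owned(const struct sketch_todr *t)
{
	return t->locked;
}

/* Save system time into the clock chip; caller must hold the lock. */
void
sketch_todr_save_systime(struct sketch_todr *t)
{
	assert(sketch_todr_lock_owned(t));
	t->rtc_seconds = t->sys_seconds;
}

/* Load system time from the clock chip; caller must hold the lock. */
void
sketch_todr_set_systime(struct sketch_todr *t)
{
	assert(sketch_todr_lock_owned(t));
	t->sys_seconds = t->rtc_seconds;
}

/* The suspend/resume pattern: the lock is held across the whole sequence. */
void
sketch_suspend_resume(struct sketch_todr *t, long slept_seconds)
{
	sketch_todr_lock(t);
	sketch_todr_save_systime(t);
	t->rtc_seconds += slept_seconds;	/* the chip keeps ticking while asleep */
	sketch_todr_set_systime(t);
	sketch_todr_unlock(t);
}
```

Holding the lock across the whole sequence is what keeps another caller from writing the clock between the save and the restore.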


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).
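
The class of bug can be illustrated with a small sketch. The kernel's actual expression and types are not shown in this log, so `sketch_pages_to_bytes` and its 32-bit parameters are assumptions; the point is only that the widening cast must happen before the multiplication.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a free-page count to bytes.
 * Widening to 64 bits BEFORE the multiply keeps the full product.
 */
uint64_t
sketch_pages_to_bytes(uint32_t npages, uint32_t pagesize)
{
	assert(pagesize != 0);
	return (uint64_t)npages * pagesize;
}

/* The buggy shape, for contrast: the product is truncated to 32 bits. */
uint32_t
sketch_pages_to_bytes_truncated(uint32_t npages, uint32_t pagesize)
{
	return npages * pagesize;
}
```

With 2097152 pages of 4096 bytes (8 GiB), the first helper yields the full value while the second wraps to zero.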

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
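
The subtraction can be sketched with `struct timespec` arithmetic. This is an illustration under assumptions: `sketch_timespec_sub` and `sketch_boottime` are hypothetical helpers, and both inputs are taken to be normalized with the first not earlier than the second.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: a - b for normalized timespecs. */
struct timespec
sketch_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
	assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {		/* borrow one second */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * boottime = (time of day when it was initialised)
 *          - (interval since the clocks were started)
 */
struct timespec
sketch_boottime(struct timespec tod_at_init, struct timespec since_clock_start)
{
	return sketch_timespec_sub(tod_at_init, since_clock_start);
}
```

For example, a time of day of 1000.2 s set 6.5 s after the clocks started gives a boot time of 993.7 s.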

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
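
The size mismatch can be modelled in a few lines. The types below are assumptions made for illustration: an n32-style ABI is modelled with 32-bit user pointers (`sketch_user_ptr_t`) and 64-bit syscall argument slots (`sketch_arg_slot_t`), and the slot array is given three elements purely for the comparison.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: 32-bit user pointers, 64-bit argument slots. */
typedef uint32_t sketch_user_ptr_t;
typedef uint64_t sketch_arg_slot_t;

#define SKETCH_INIT_ARGC 3	/* the log says argv[] holds three pointers */

/* Bytes that should be copied out: three user-sized pointers. */
size_t
sketch_argv_bytes(void)
{
	return SKETCH_INIT_ARGC * sizeof(sketch_user_ptr_t);
}

/* Bytes the buggy code copied: the size of an array of argument slots. */
size_t
sketch_slot_bytes(void)
{
	sketch_arg_slot_t args[SKETCH_INIT_ARGC];

	/* The model only shows the bug when slots are wider than pointers. */
	assert(sizeof(sketch_arg_slot_t) > sizeof(sketch_user_ptr_t));
	return sizeof(args);
}
```

In this model the correct length is 12 bytes while `sizeof(args)` is 24, so the copy would run past the intended argv area.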


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
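
What makes a random UUID "v4" is two fixed bit fields stamped over otherwise random bytes. The sketch below shows only that stamping step, following the RFC 4122 byte layout; `sketch_uuid_v4_stamp` is a hypothetical name, and the kernel's random source and `struct uuid` handling are not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Turn 16 caller-supplied random bytes into a version 4 UUID in place
 * by overwriting the version and variant fields.
 */
void
sketch_uuid_v4_stamp(uint8_t uuid[16])
{
	assert(uuid != NULL);
	uuid[6] = (uint8_t)((uuid[6] & 0x0f) | 0x40);	/* version = 4 */
	uuid[8] = (uint8_t)((uuid[8] & 0x3f) | 0x80);	/* variant = 10xxxxxx */
}
```

Because no MAC address or timestamp is involved, no global state needs to be kept between calls.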


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
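
Since the use of the list is undocumented, the following is only a guess at its shape: a NULL-terminated array walked in order until one path can be started. `sketch_pick_init`, the callback type and `sketch_only_rescue` are hypothetical; only the path strings come from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message. */
const char *const sketch_initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical callback type: returns 0 if the path could be exec'd. */
typedef int (*sketch_try_exec_fn)(const char *path);

/* Walk the list in order; return the first path that works, or NULL. */
const char *
sketch_pick_init(const char *const *paths, sketch_try_exec_fn try_exec)
{
	assert(paths != NULL && try_exec != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Demonstration predicate: pretend only /rescue/init exists. */
int
sketch_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

Placing "/rescue/init" last means it is reached only when every earlier candidate has failed.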


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
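
The two rules above can be modelled as a small predicate. This is only a
sketch: the enum and the function name are hypothetical and are not taken
from the secmodel_extensions(9) code, which goes through secmodel_eval(9)
and the "is-securelevel-above" evaluation instead of taking the
securelevel as a plain argument.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical knob identifiers, not the names used in the kernel. */
enum ext_knob {
	EXT_CURTAIN,
	EXT_USERMOUNT,
	EXT_USER_SET_CPU_AFFINITY
};

/* Return true if setting `knob` to `enable` is allowed at `securelevel`. */
static bool
ext_change_allowed(enum ext_knob knob, bool enable, int securelevel)
{
	if (securelevel <= 0)
		return true;	/* no restriction unless securelevel is above 0 */

	switch (knob) {
	case EXT_CURTAIN:
		return enable;	/* may be enabled, never disabled */
	case EXT_USERMOUNT:
	case EXT_USER_SET_CPU_AFFINITY:
		return !enable;	/* may be disabled, never enabled */
	}
	assert(!"unknown knob");
	return false;
}
```

The asymmetry is deliberate: at a raised securelevel a setting can only
move in the more restrictive direction.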

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
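
To give an idea of what such a test looks like, here is a sketch of the
simplest of the FIPS 140-2 statistical tests, the monobit test, over one
20000-bit sample. It is not the libkern/rngtest.c code; the function
names are made up for illustration, and the pass bounds are the ones
recalled from the FIPS 140-2 text and should be checked against the
standard before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIPS_SAMPLE_BYTES 2500	/* 20000 bits */

/* Count the one bits in a 20000-bit sample. */
static unsigned
monobit_count(const uint8_t sample[FIPS_SAMPLE_BYTES])
{
	unsigned ones = 0;

	assert(sample != NULL);
	for (size_t i = 0; i < FIPS_SAMPLE_BYTES; i++) {
		/* Clear the lowest set bit until none remain. */
		for (uint8_t b = sample[i]; b != 0; b &= b - 1)
			ones++;
	}
	return ones;
}

/* Return 1 if the sample passes the monobit test, 0 otherwise. */
static int
monobit_pass(const uint8_t sample[FIPS_SAMPLE_BYTES])
{
	unsigned ones = monobit_count(sample);

	return ones > 9725 && ones < 10275;
}
```

A source stuck at all ones or all zeroes fails immediately, which is the
kind of broken hardware RNG the attach-time check is meant to catch.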

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).
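
The deferral pattern can be sketched as a queue of callbacks that is
drained once the root file system is available. All names below are
hypothetical and the fixed-size array is a simplification; running the
callback immediately when root is already mounted is a choice made for
this sketch, not a statement about config_mountroot(9) itself.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8	/* arbitrary limit for the sketch */

struct deferred {
	void (*fn)(void *);
	void *arg;
};

static struct deferred mountroot_queue[MAX_DEFERRED];
static size_t mountroot_pending;
static int root_mounted;

/* Run fn(arg) once the root file system is mounted. */
static void
defer_until_mountroot(void (*fn)(void *), void *arg)
{
	if (root_mounted) {
		fn(arg);	/* nothing to wait for */
		return;
	}
	assert(mountroot_pending < MAX_DEFERRED);
	mountroot_queue[mountroot_pending].fn = fn;
	mountroot_queue[mountroot_pending].arg = arg;
	mountroot_pending++;
}

/* Called once after the root file system has been mounted. */
static void
mountroot_done(void)
{
	root_mounted = 1;
	for (size_t i = 0; i < mountroot_pending; i++)
		mountroot_queue[i].fn(mountroot_queue[i].arg);
	mountroot_pending = 0;
}

/* Example callback: stands in for a driver loading its firmware. */
static void
demo_finish_attach(void *arg)
{
	++*(int *)arg;
}
```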

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
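
How the answer typed at the init path prompt might be classified, as a
sketch only. The enum, the function name and the rule that a usable path
must start with '/' are assumptions made for illustration; they are not
taken from init_main.c.

```c
#include <assert.h>
#include <string.h>

enum init_answer {
	ANS_HALT,
	ANS_REBOOT,
	ANS_DDB,
	ANS_PATH,
	ANS_INVALID
};

/* Map the string typed at the prompt to an action. */
static enum init_answer
classify_init_answer(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return ANS_HALT;
	if (strcmp(s, "reboot") == 0)
		return ANS_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return ANS_DDB;
	if (s[0] == '/')
		return ANS_PATH;	/* assumption: paths are absolute */
	return ANS_INVALID;	/* caller would ask again */
}
```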


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
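
The shape of this can be sketched in portable C11. This is not the
kernel code: the structure, the names and the spin lock standing in for
the global lock are all assumptions, chosen to show the lockless fast
path and the "no retry after error" behaviour.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct once {
	atomic_int o_done;	/* nonzero once the routine has run */
	int o_error;		/* value the routine returned */
};

/* Stand-in for a global lock; a kernel would use a sleepable mutex. */
static atomic_flag once_lock = ATOMIC_FLAG_INIT;

static int
run_once(struct once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->o_done, memory_order_acquire))
		return o->o_error;

	while (atomic_flag_test_and_set_explicit(&once_lock,
	    memory_order_acquire))
		continue;	/* spin until the lock is free */

	if (!atomic_load_explicit(&o->o_done, memory_order_relaxed)) {
		o->o_error = fn();
		/* Mark done even on failure: the routine is never retried. */
		atomic_store_explicit(&o->o_done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_lock, memory_order_release);
	return o->o_error;
}

/* Example routine that always fails, to show the error is remembered. */
static int demo_init_calls;

static int
demo_init(void)
{
	demo_init_calls++;
	return 5;	/* arbitrary nonzero error code */
}
```

Calling run_once() twice with demo_init on a zero-initialized struct once
returns the same error both times and runs demo_init only once.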


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
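
The arithmetic behind the cap, as a sketch. The function names and the
way the sizes are passed in are assumptions; only the 2kB per vnode and
the 75% limits come from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST_BYTES 2048u	/* assumed worst-case cost per vnode */

/* Largest vnode count whose worst-case cost fits in 75% of `bytes`. */
static uint64_t
vnodes_fitting(uint64_t bytes)
{
	return (bytes / 4 * 3) / VNODE_COST_BYTES;
}

/* Upper bound on desiredvnodes for the given KVA and RAM sizes. */
static uint64_t
desiredvnodes_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	assert(kva_bytes > 0 && phys_bytes > 0);

	uint64_t by_kva = vnodes_fitting(kva_bytes);
	uint64_t by_phys = vnodes_fitting(phys_bytes);

	return by_kva < by_phys ? by_kva : by_phys;
}

/* Return 1 if the requested value may be set, 0 if it must be rejected. */
static int
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= desiredvnodes_limit(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory the KVA is the tighter
bound, and the limit works out to 3/4 * 2^30 / 2^11 = 393216 vnodes.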


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot
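
The boottime rule above can be illustrated with a small userland model. This
is only a sketch under stated assumptions: struct clock_model, model_settime()
and model_uptime() are hypothetical names invented here, not kernel
interfaces. The point it shows is that when a set-time operation shifts
boottime by the same delta as the wall clock, the derived uptime stays the
same.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userland model; the field names are not the kernel's. */
struct clock_model {
	int64_t now;      /* current wall-clock time, in seconds */
	int64_t boottime; /* wall-clock time at which the system booted */
};

/* Step the wall clock to new_now and shift boottime by the same delta. */
void
model_settime(struct clock_model *c, int64_t new_now)
{
	int64_t delta = new_now - c->now;

	c->now = new_now;
	c->boottime += delta;
}

/* Seconds since boot, derived the way a tool like uptime(1) might. */
int64_t
model_uptime(const struct clock_model *c)
{
	return c->now - c->boottime;
}
```

In this model a large clock step, such as a correction applied after a sleep,
moves both values together, so the reported time since boot is unaffected.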


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
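
The reference-counting rule described above (a lookup takes a reference, a
release drops it, and detaching an in-use discipline fails with EBUSY) can be
sketched in userland as follows. Everything here is hypothetical: struct ldisc
and the model_ldisc_*() functions are illustrative names, not the kernel's
ttyldisc_*() interfaces, and the model is single-threaded with no locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; not the kernel's struct linesw. */
struct ldisc {
	const char *name;
	unsigned int refcnt;  /* references handed out by lookup */
	struct ldisc *next;   /* entries live on a linked list */
};

static struct ldisc *ldisc_head;

/* Register a discipline; refuse duplicates by name. */
int
model_ldisc_attach(struct ldisc *d)
{
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_head;
	ldisc_head = d;
	return 0;
}

/* Find a discipline by name and take a reference on it. */
struct ldisc *
model_ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference previously taken by model_ldisc_lookup(). */
void
model_ldisc_release(struct ldisc *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Unregister a discipline; fails with EBUSY while references remain. */
int
model_ldisc_detach(struct ldisc *d)
{
	if (d->refcnt > 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A caller would pair every successful lookup with a release; only once all
references are dropped does a detach succeed.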


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
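
The substitution described above can be illustrated with a small user-space
sketch. This is not the kernel's implementation: the function expand_magic()
and struct magic_var are hypothetical names, and the rule that a magic name
only matches when followed by '/' or the end of the string is an assumption
made here so that "@machine" is not taken as a prefix of "@machine_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; a kernel would take values from system state. */
struct magic_var {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "x86_64" */
};

/*
 * Expand magic names found in `link` into `out` (capacity `outsz`,
 * including the terminating NUL).
 * Assumption: a name matches only when followed by '/' or end of string.
 * Returns 0 on success, -1 if the expanded result does not fit.
 */
int
expand_magic(const char *link, const struct magic_var *vars, size_t nvars,
    char *out, size_t outsz)
{
	size_t o = 0;

	assert(link != NULL && out != NULL && outsz > 0);
	while (*link != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*link == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (strncmp(link, vars[i].name, n) == 0 &&
				    (link[n] == '/' || link[n] == '\0')) {
					rep = vars[i].value;
					skip = n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			/* Leave room for the final NUL. */
			if (o + n >= outsz)
				return -1;
			memcpy(out + o, rep, n);
			o += n;
			link += skip;
		} else {
			if (o + 1 >= outsz)
				return -1;
			out[o++] = *link++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

With a table mapping "@machine_arch" to some architecture name, the link
target "/arch/@machine_arch/bin" from the example usage would expand to the
matching per-architecture directory, while an unknown name such as
"@machinery" is copied through unchanged.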


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
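
A minimal sketch of a marker-skipping iteration macro, in the spirit of the
PROCLIST_FOREACH description above. Everything here is hypothetical and
self-contained: struct entry, its is_marker flag, ENTRY_FOREACH and
count_real() are illustrative names, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list element; is_marker is nonzero for placeholder entries. */
struct entry {
	struct entry *next;
	int is_marker;
	int pid;
};

/*
 * Iterate over a singly linked list, skipping marker entries.
 * The trailing "else" lets the caller's loop body follow the macro.
 */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker) continue; else

/* Count the entries that are not markers. */
size_t
count_real(struct entry *head)
{
	struct entry *e;
	size_t n = 0;

	ENTRY_FOREACH(e, head) {
		assert(!e->is_marker);	/* markers never reach the body */
		n++;
	}
	return n;
}
```

The point of the design is that code walking the list does not have to know
that another iterator has parked a placeholder in it.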


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
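
The sizing rule can be sketched as simple arithmetic. This is an illustration
only, not the code in init_main.c: default_vnode_count() is a hypothetical
helper, the fraction is passed in per mille (10 for 1%, 5 for the earlier
0.5%), and the size of struct vnode is a caller-supplied assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of vnodes that fit in permille/1000 of
 * physical memory, never less than min_vnodes.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned permille, uint64_t min_vnodes)
{
	uint64_t budget, n;

	assert(vnode_size > 0);
	/* Divide first so very large memory sizes cannot overflow. */
	budget = physmem_bytes / 1000 * permille;
	n = budget / vnode_size;
	return n < min_vnodes ? min_vnodes : n;
}
```

For a fixed memory size, a larger struct vnode yields fewer cached vnodes,
which is the concern raised above for the 64-bit ports.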


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.
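
The "run until nobody did any work" loop can be sketched as follows. This is
a stand-alone illustration, not the autoconfiguration code: struct finalizer,
run_finalizers() and drain_one() are hypothetical names, and the convention
that a finalizer returns nonzero when it did work is an assumption of this
sketch.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one complete pass reports no work.
 * Returns the number of passes made (always at least 1).
 */
unsigned
run_finalizers(const struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;

	assert(list != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: consumes one unit of pending work per call. */
int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Iterating to a fixed point matters because one finalizer's work (for example
configuring a RAID set) may expose new devices that give another finalizer
something to do on the next pass.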

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
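
The mapping above can be written down as a lookup function. This is a sketch
for illustration, not code from the tree: the OLD_KERN_* enumerators are
hypothetical stand-ins with arbitrary values, and returning -1 for
KERN_FAILURE and the unused codes is a choice made here, since the commit
says those have no single errno equivalent.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; the values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE,
	OLD_KERN_NOT_RECEIVER,
	OLD_KERN_NO_ACCESS,
	OLD_KERN_PAGES_LOCKED
};

/* Translate an old code to an errno value, or -1 if there is no fixed one. */
int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	default:
		/* KERN_FAILURE ("various") and the unused codes. */
		return -1;
	}
}

/* Self-check mirroring a few rows of the table. */
void
kern_to_errno_selfcheck(void)
{
	assert(kern_to_errno(OLD_KERN_SUCCESS) == 0);
	assert(kern_to_errno(OLD_KERN_NO_SPACE) == ENOMEM);
	assert(kern_to_errno(OLD_KERN_RESOURCE_SHORTAGE) == ENOMEM);
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is
not reversible.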


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
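
The initialization rule can be sketched in user space as follows. This is an
illustration, not the proc0 setup code: init_limit() is a hypothetical helper,
with compiled_default standing in for NOFILE or MAXUPRC and system_max for
maxfiles or maxproc.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: the hard limit is the system-wide maximum rather
 * than RLIM_INFINITY, and the soft limit is the compiled-in default
 * clamped to that maximum.
 */
void
init_limit(struct rlimit *rl, rlim_t compiled_default, rlim_t system_max)
{
	assert(rl != NULL);
	rl->rlim_max = system_max;
	rl->rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;
}
```

A program that sizes a table from getrlimit() then sees a finite hard limit,
and the soft limit can never exceed what the system can actually provide.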


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
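
The mapping above swaps the source and destination arguments for the copy
routines, which is easy to get wrong when converting by hand. A small
userland sketch of the replacements, with hypothetical wrapper names chosen
only for this example:

```c
#include <assert.h>
#include <string.h>

/* Old: bcopy(src, dst, len).  New: memcpy(dst, src, len). */
static void
copy_bytes(void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, len);		/* regions must not overlap */
}

/* Old: ovbcopy(src, dst, len).  New: memmove(dst, src, len). */
static void
copy_overlapping(void *dst, const void *src, size_t len)
{
	memmove(dst, src, len);		/* overlap is handled correctly */
}

/* Old: bzero(buf, len).  New: memset(buf, 0, len). */
static void
zero_bytes(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Note that bcmp() and memcmp() take their arguments in the same order, so
only the copy routines need the swap.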


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.522 24-Feb-2020 jdolecek

move config_init_mi() call before vfsinit(), which can trigger loading
of VFS modules

fixes crash with LOCKDEBUG due to uninitialized mutex when zfs
module is loaded in boot, because zfs's spa_init() calls config_mountroot()
which now requires the config init having been done


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that makes it possible to tell where a
leak occurs, while the simpler version that is already committed just reports
that a leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
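
The boottime adjustment described above amounts to one timespec
subtraction. A rough userland sketch, assuming the kernel knows the time of
day at the moment it is initialised and the uptime that has elapsed since
the clocks were started; the function names are hypothetical and this is
not the code from this revision:

```c
#include <assert.h>
#include <time.h>

/* Return a - b for normalized timespecs (0 <= tv_nsec < 1e9), a >= b. */
static struct timespec
timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
	assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second to keep tv_nsec non-negative. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * tod_at_init:    wall-clock time when the time of day was initialised.
 * uptime_at_init: time elapsed since the clocks were started, at that
 *                 same moment.
 * The difference is the wall-clock time at which the clocks started.
 */
static struct timespec
boottime_from(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return timespec_sub(tod_at_init, uptime_at_init);
}
```

For example, if the time of day is set 6.5 seconds after the clocks start,
the computed boottime is 6.5 seconds earlier than the time of day at that
point, which is the offset the message above describes.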


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
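
The deferred-establishment pattern described above can be sketched in userland roughly as follows. This is a single-threaded illustration only: `once_ctl`, `run_once`, `establish_exithook` and `fake_sys_semop` are hypothetical names, and the real RUN_ONCE facility also has to cope with concurrent callers, which this sketch does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical control block for a one-time initializer. */
struct once_ctl {
	bool done;	/* has the initializer already run? */
	int error;	/* result of the single run */
};

static int hooks_established;	/* counts how often the initializer ran */

/* Stand-in for the routine that would establish the exit hook. */
static int
establish_exithook(void)
{
	hooks_established++;
	return 0;
}

/* Run fn on the first call only; later calls return the saved result. */
static int
run_once(struct once_ctl *ctl, int (*fn)(void))
{
	assert(ctl != NULL && fn != NULL);
	if (!ctl->done) {
		ctl->error = fn();
		ctl->done = true;
	}
	return ctl->error;
}

static struct once_ctl sem_once;

/* Stand-in for a sys_sem* entry point. */
static int
fake_sys_semop(void)
{
	int error = run_once(&sem_once, establish_exithook);

	if (error != 0)
		return error;
	/* ... the actual syscall work would go here ... */
	return 0;
}
```

With this shape, a module that is loaded and unloaded before any syscall is invoked never establishes the hook, which matches the reason given above for dropping the KASSERT.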


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as are lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was made with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
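
A minimal userland sketch of how such a fallback list might be walked, assuming the entries are tried in order until one can be executed (the log itself notes that the kernel's use of initpaths is undocumented). `pick_init` and the probe callbacks are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the log entry above, NULL-terminated. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical probe: returns 0 if the path could be executed. */
typedef int (*exec_probe_t)(const char *path);

/* Return the first path the probe accepts, or NULL if none works. */
static const char *
pick_init(const char *const *paths, exec_probe_t probe)
{
	assert(paths != NULL && probe != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (probe(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Example probe: a system where only the rescue binary is left. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Example probe: nothing can be executed at all. */
static int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```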


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate the vnode cache size based on the resource it gets allocated from.
This stops kern.maxvnodes from being set so high that it exhausts available space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
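
The register/evaluate idea described above can be illustrated with a small userland mock. This is not the secmodel(9) implementation: `model_register`, `model_eval`, the "toy.securelevel" identifier and the "is-above" request are all hypothetical, and -1 simply stands in for whatever error the real API returns when a model is absent.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODELS 4

/* Evaluation callback: answers request `what`, writing the result to ret. */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct model {
	const char *id;
	eval_fn eval;
};

static struct model models[MAX_MODELS];
static size_t nmodels;

/* Register a model under a unique id; returns -1 if the table is full. */
static int
model_register(const char *id, eval_fn eval)
{
	assert(id != NULL && eval != NULL);
	if (nmodels == MAX_MODELS)
		return -1;
	models[nmodels].id = id;
	models[nmodels].eval = eval;
	nmodels++;
	return 0;
}

/* Forward a request to the named model; -1 if that model is absent. */
static int
model_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < nmodels; i++) {
		if (strcmp(models[i].id, id) == 0)
			return models[i].eval(what, arg, ret);
	}
	return -1;
}

/* Toy model: answers whether its private level exceeds *arg. */
static int toy_level = 1;

static int
toy_securelevel_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-above") != 0)
		return -1;	/* unknown request */
	*(bool *)ret = toy_level > *(int *)arg;
	return 0;
}
```

The caller never touches `toy_level` directly, so the sketch keeps working (by returning an error) when the model was never registered.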

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
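
The two rules above can be written down as simple predicates. The sketch below is a hypothetical restatement, not the extensions secmodel's code; in particular, treating a write that does not change the value as always permitted is an assumption of this example.

```c
#include <stdbool.h>

/*
 * curtain: may be enabled at any time, but may not be disabled once
 * securelevel is above 0. A no-op write is allowed (assumption).
 */
static bool
curtain_change_ok(bool oldval, bool newval, int securelevel)
{
	return oldval == newval || newval || securelevel <= 0;
}

/*
 * usermount / user_set_cpu_affinity: may be disabled at any time, but may
 * not be enabled once securelevel is above 0. A no-op write is allowed
 * (assumption).
 */
static bool
usermount_change_ok(bool oldval, bool newval, int securelevel)
{
	return oldval == newval || !newval || securelevel <= 0;
}
```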

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
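
To give a feel for the kind of check involved, here is a small sketch of one of the FIPS 140-2 statistical tests, the monobit test, over a 20,000-bit sample. It is not taken from libkern/rngtest.c; `monobit_ok` is a hypothetical name and the acceptance bounds are the ones I understand the FIPS 140-2 monobit test to specify.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_BYTES 2500	/* 20,000 bits per sample */

/*
 * Monobit test: count the one bits in the sample and require the count
 * to fall strictly between 9725 and 10275. A stuck or heavily biased
 * source fails this immediately.
 */
static bool
monobit_ok(const uint8_t *buf, size_t len)
{
	unsigned ones = 0;

	assert(buf != NULL && len == SAMPLE_BYTES);
	for (size_t i = 0; i < len; i++) {
		/* Clear the lowest set bit until none remain. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	return ones > 9725 && ones < 10275;
}
```

A source stuck at all zeros is rejected, while a perfectly balanced pattern passes this particular test even though it is obviously not random, which is why several different tests are run together.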

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem for good.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
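
The sticky "must force" behaviour from items 2 and 4 can be sketched as below. The structure, flag values and function names are hypothetical stand-ins chosen for this example and are not the module(9) definitions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MODFLG_MUST_FORCE	0x01	/* hypothetical flag value */
#define TOY_LOAD_FORCE		0x01	/* hypothetical load option */

struct toy_module {
	const char *name;
	unsigned flags;
	int loaded;
};

/* Mark a module so that any later load needs the force option. */
static void
toy_require_force(struct toy_module *mod)
{
	assert(mod != NULL);
	mod->flags |= TOY_MODFLG_MUST_FORCE;	/* sticky: never cleared */
}

/* Load the module; refuse if it must be forced and force was not given. */
static int
toy_load(struct toy_module *mod, unsigned loadflags)
{
	assert(mod != NULL);
	if ((mod->flags & TOY_MODFLG_MUST_FORCE) != 0 &&
	    (loadflags & TOY_LOAD_FORCE) == 0)
		return -1;
	mod->loaded = 1;
	return 0;
}
```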

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).
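
The deferral idea can be pictured as a queue of callbacks that is drained once the root file system is available. The following is a userland sketch with hypothetical names, not the config_mountroot(9) implementation; running a callback immediately when the root is already mounted is a choice made for this example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef void (*deferred_fn)(void *);

struct deferred {
	deferred_fn fn;
	void *arg;
};

static struct deferred queue[MAX_DEFERRED];
static size_t nqueued;
static int root_mounted;

/* Queue fn(arg) until the root file system is mounted. */
static int
defer_until_mountroot(deferred_fn fn, void *arg)
{
	assert(fn != NULL);
	if (root_mounted) {
		fn(arg);	/* nothing to wait for any more */
		return 0;
	}
	if (nqueued == MAX_DEFERRED)
		return -1;
	queue[nqueued].fn = fn;
	queue[nqueued].arg = arg;
	nqueued++;
	return 0;
}

/* Called once the root file system is mounted: drain the queue in order. */
static void
toy_mountroot_done(void)
{
	root_mounted = 1;
	for (size_t i = 0; i < nqueued; i++)
		queue[i].fn(queue[i].arg);
	nqueued = 0;
}

/* Example deferred step: pretend to load firmware from the root fs. */
static void
load_firmware(void *arg)
{
	*(int *)arg += 1;
}
```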

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.
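
An always-present op vector of this kind might look roughly like the sketch below: callers invoke through a pointer that initially refers to no-op stubs and is switched to the real operations on attach. All names here (`tap_ops`, `cur_ops`, `driver_input` and so on) are hypothetical and not the bpf interface.

```c
#include <assert.h>
#include <stddef.h>

/* Operations a packet-tap provider offers to drivers. */
struct tap_ops {
	void (*tap)(const void *pkt, size_t len);
};

/* Stub used while no provider is attached: does nothing. */
static void
tap_stub(const void *pkt, size_t len)
{
	(void)pkt;
	(void)len;
}

static const struct tap_ops stub_ops = { tap_stub };
static const struct tap_ops *cur_ops = &stub_ops;	/* never NULL */

/* A "real" provider that just counts the bytes it sees. */
static size_t bytes_seen;

static void
tap_real(const void *pkt, size_t len)
{
	(void)pkt;
	bytes_seen += len;
}

static const struct tap_ops real_ops = { tap_real };

/* Switch callers over to a provider's operations. */
static void
tap_attach(const struct tap_ops *ops)
{
	assert(ops != NULL && ops->tap != NULL);
	cur_ops = ops;
}

/* Driver path: no #if and no NULL check, just an indirect call. */
static void
driver_input(const void *pkt, size_t len)
{
	cur_ops->tap(pkt, len);
}
```

The driver code is identical whether or not a provider exists, which is the property that lets callers be built without a compile-time conditional.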

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed, with no objection, on tech-kern. Changes summary by apb@.
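
The prompt behaviour described above can be sketched as a small classifier.
This is only an illustration, not the code from init_main.c: the enum, the
function name classify_init_input(), and the rule that a valid path must
start with '/' are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for one line typed at the init path prompt. */
enum init_input {
	INIT_INPUT_PATH,	/* looks like a path to an init(8) program */
	INIT_INPUT_HALT,	/* special string "halt" */
	INIT_INPUT_REBOOT,	/* special string "reboot" */
	INIT_INPUT_DDB,		/* special string "ddb" */
	INIT_INPUT_INVALID	/* anything else: caller prints a usage line */
};

/*
 * Classify the user's answer.  The three special strings come from the
 * log message; treating only absolute paths as valid is an assumption.
 */
static enum init_input
classify_init_input(const char *s)
{
	assert(s != NULL);

	if (strcmp(s, "halt") == 0)
		return INIT_INPUT_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_INPUT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_INPUT_DDB;
	if (s[0] == '/')
		return INIT_INPUT_PATH;
	return INIT_INPUT_INVALID;
}
```

A caller would loop on the prompt until the result is something other than
INIT_INPUT_INVALID, printing the usage line each time it is.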


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
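
A userland sketch of the pattern described above, assuming C11 atomics and
a pthread mutex in place of the kernel primitives. The names struct once,
run_once() and failing_init() are hypothetical, and holding the global lock
while the routine runs is a simplification, not necessarily what the kernel
code does.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ONCE_NEW = 0, ONCE_DONE = 1 };

struct once {
	atomic_int state;	/* ONCE_NEW until the routine has run */
	int error;		/* cached result, valid once ONCE_DONE */
};
#define ONCE_INITIALIZER { ONCE_NEW, 0 }

/* One global lock shared by every once control (no per-object lock). */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: lockless check once initialization has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		/* Run exactly once; an error is recorded and never retried. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Demo routine that always fails, to show that errors are cached. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

Calling run_once() twice with failing_init returns the same error both
times while the routine itself runs only once.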


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
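
The limit described above works out as follows. This is a sketch under
stated assumptions: the helper names vnode_limit() and vnode_check() are
hypothetical, and the real sysctl handler may compute the bound differently.
Only the 2kB per-vnode cost and the 75% figure come from the log message.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of a vnode plus associated structures (2kB, per the log). */
#define VNODE_COST_BYTES 2048u

/*
 * Largest vnode count whose worst-case footprint stays within 75% of
 * both the kernel virtual address space and physical memory.
 */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so large sizes cannot overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Returns 0 if the requested value is acceptable, -1 if it is too large. */
static int
vnode_check(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= vnode_limit(kva_bytes, phys_bytes) ? 0 : -1;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the KVA is the
tighter bound: 75% of 1 GiB is 768 MiB, which at 2kB per vnode allows at
most 393216 vnodes.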


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clauses 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
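
The reference-counting rule in the list above (a lookup holds a reference,
and detaching an in-use discipline fails with EBUSY) can be sketched as
follows. The names ldisc_attach(), ldisc_lookup(), ldisc_release() and
ldisc_detach() are hypothetical stand-ins, not the ttyldisc_*() signatures,
and the locking a real kernel would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; the real linesw structure differs. */
struct ldisc {
	const char *name;
	unsigned int refcnt;	/* references held by successful lookups */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

static struct ldisc *ldisc_head;

/* Register a discipline; fails if the name is already taken. */
static int
ldisc_attach(struct ldisc *ld)
{
	struct ldisc *p;

	for (p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_head;
	ldisc_head = ld;
	return 0;
}

/* Look up by name; a successful lookup takes a reference. */
static struct ldisc *
ldisc_lookup(const char *name)
{
	struct ldisc *p;

	for (p = ldisc_head; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
static void
ldisc_release(struct ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while any reference is held. */
static int
ldisc_detach(struct ldisc *ld)
{
	struct ldisc **pp;

	if (ld->refcnt != 0)
		return EBUSY;
	for (pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A provider attaches its discipline once; a consumer pairs every successful
lookup with a release, after which detach succeeds.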


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
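
The substitution described above can be sketched in userland C roughly as
follows. This is an illustration under assumptions, not the kernel code:
struct magic and magic_expand() are hypothetical names, the table values are
supplied by the caller, and the kernel's exact matching rules may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name and the value it expands to. */
struct magic {
	const char *name;
	const char *value;
};

/*
 * Expand magic strings in src into dst (at most dstlen bytes, including
 * the terminating NUL). Returns 0 on success, -1 if dst is too small.
 * Longer names must precede their prefixes in tbl, so that
 * "@machine_arch" is tried before "@machine".
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic *tbl, size_t ntbl)
{
	size_t used = 0;

	assert(src != NULL && dst != NULL && tbl != NULL);
	while (*src != '\0') {
		const char *out = src;		/* default: copy one byte */
		size_t outlen = 1, adv = 1;

		if (*src == '@') {
			for (size_t i = 0; i < ntbl; i++) {
				size_t n = strlen(tbl[i].name);

				if (strncmp(src, tbl[i].name, n) == 0) {
					out = tbl[i].value;
					outlen = strlen(out);
					adv = n;
					break;
				}
			}
		}
		if (used + outlen >= dstlen)	/* keep room for the NUL */
			return -1;
		memcpy(dst + used, out, outlen);
		used += outlen;
		src += adv;
	}
	if (used >= dstlen)
		return -1;
	dst[used] = '\0';
	return 0;
}
```

With a table that maps @machine_arch to m68k, the link target
/arch/@machine_arch/bin would expand to /arch/m68k/bin.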


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
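
A marker-skipping iteration macro of the kind described above might look like
the following sketch. The names struct entry, is_marker and ENTRY_FOREACH are
hypothetical stand-ins; the real macro operates on the kernel's proclist and
its own marker flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list entry; a marker only records an iterator's position. */
struct entry {
	struct entry	*next;
	bool		 is_marker;
	int		 pid;
};

/* Like a plain list walk, but the body is skipped for marker entries. */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (!(var)->is_marker)

/* Sum the pids of all real entries. */
int
sum_pids(struct entry *head)
{
	struct entry *e;
	int sum = 0;

	ENTRY_FOREACH(e, head) {
		assert(!e->is_marker);
		sum += e->pid;
	}
	return sum;
}
```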


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
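
The sizing rule discussed here (a fixed fraction of physical memory, with the
configured NVNODE as a floor per rev. 1.175) can be sketched as below. The
function name, the per-mille parameter and the vnode size passed in are
assumptions for illustration; the kernel's actual computation is not shown in
this log.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sizing helper: give the vnode cache `permille` thousandths
 * of physical memory, but never fewer than `min_vnodes` entries.
 * permille = 5 corresponds to 0.5%, permille = 10 to 1%.
 */
uint64_t
vnode_cache_size(uint64_t physmem_bytes, uint64_t vnode_bytes,
    unsigned permille, uint64_t min_vnodes)
{
	uint64_t n;

	assert(vnode_bytes > 0 && permille <= 1000);
	/* Divide first so the intermediate product stays small. */
	n = physmem_bytes / 1000 * permille / vnode_bytes;
	return n > min_vnodes ? n : min_vnodes;
}
```

Doubling the fraction from 5 to 10 per mille doubles the default cache size
on any machine large enough to exceed the floor.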


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
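
The "invoke until nobody does any work" loop described above can be sketched
as follows. All names here (struct finalizer, run_finalizers(),
configure_one()) are hypothetical; the sketch only illustrates the fixed-point
iteration, not the kernel's registration interface.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_t)(void *);

struct finalizer {
	finalizer_t	 fn;
	void		*arg;
};

/*
 * Call every finalizer repeatedly until a full pass reports no work.
 * Returns the number of passes made.
 */
int
run_finalizers(const struct finalizer *f, size_t n)
{
	int passes = 0, did_work;

	assert(f != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			did_work |= (f[i].fn(f[i].arg) != 0);
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: configures one pending unit per call. */
int
configure_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Iterating to a fixed point lets one finalizer's work (for example, a newly
configured RAID set) expose further work for another on the next pass.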


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
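
For reference, the mapping above can be written as a small translation
function. This is only an illustration: the enum values are arbitrary
placeholders rather than the historical constants, kern_to_errno() is a
hypothetical name, and KERN_FAILURE and the unused codes are left out because
the commit gives them no single replacement.

```c
#include <assert.h>
#include <errno.h>

/* Placeholder values; the historical constants may differ. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code into the traditional errno value. */
int
kern_to_errno(enum kern_code code)
{
	assert(code <= KERN_RESOURCE_SHORTAGE);
	switch (code) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return EINVAL;		/* not reached for valid codes */
}
```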


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
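
The overflow this change avoids can be illustrated with a small sketch.
uint64_t stands in for rlim_t here, and avail_bytes() and its parameters are
hypothetical; the point is only that the page count is widened before it is
multiplied.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a page count to bytes in a 64-bit
 * unsigned type, standing in for rlim_t.
 */
uint64_t
avail_bytes(uint32_t npages, uint32_t page_size)
{
	assert(page_size != 0);
	/* Widen before multiplying so the product cannot wrap at 32 bits. */
	return (uint64_t)npages * page_size;
}
```

With 4096-byte pages, 786432 pages is 3 GiB, which does not fit in a signed
32-bit integer, and 1048576 pages is 4 GiB, which does not fit in an unsigned
one either.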


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create a new kernel thread to issue the statfs(2) system call to check free
disk space rather than doing it in the timeout handler. This fixes the long
standing bug that the accounting file can't be put on an NFS file system (so,
e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
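A minimal sketch in C of how machine-dependent code might turn the (low address, size) pair into an initial stack pointer. The helper name `initial_sp` and the `grows_down` flag are assumptions made for illustration; real cpu_fork() implementations also deal with alignment and frame setup, which this omits.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the starting stack pointer for a child.
 * The caller always passes the lowest address of the area plus its size;
 * only this function needs to know which way the stack grows.
 */
uintptr_t initial_sp(uintptr_t stack_base, size_t stack_size, int grows_down)
{
	/* Downward-growing stacks start at the top of the area. */
	return grows_down ? stack_base + stack_size : stack_base;
}
```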


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
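The rule above can be sketched in C as follows. The function names `rlim_min` and `init_limit` are hypothetical and chosen for illustration; only the min() logic and the choice of hard limit come from the commit message.

```c
#include <sys/resource.h>

/* Smaller of two limits. */
rlim_t rlim_min(rlim_t a, rlim_t b)
{
	return a < b ? a : b;
}

/*
 * Hypothetical helper: fill in one resource limit.
 * Soft limit = min(compile-time default, system-wide maximum);
 * hard limit = system-wide maximum rather than RLIM_INFINITY.
 */
void init_limit(struct rlimit *rl, rlim_t default_soft, rlim_t system_max)
{
	rl->rlim_cur = rlim_min(default_soft, system_max);
	rl->rlim_max = system_max;
}
```

With this shape the soft limit can never exceed the hard limit, even when the system-wide maximum is configured below the compile-time default.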


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
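The argument-order swap in the first two mappings is the easy part to get wrong. A minimal sketch in C of the same conversions, written as wrappers; the `old_*` names are hypothetical and exist only for this illustration.

```c
#include <string.h>

/* bcopy(src, dst, len) takes the source first; memcpy takes the destination first. */
void old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) allowed overlapping regions, so memmove is the match. */
void old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): zero means the regions are equal; argument order is unchanged. */
int old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len): memset gains an explicit fill value of 0 in the middle. */
void old_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```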


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.521 18-Feb-2020 chs

remove the aiodoned thread. I originally added this to provide a thread context
for doing page cache iodone work, but since then biodone() has changed to
hand off all iodone work to a softint thread, so we no longer need the
special-purpose aiodoned thread.


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
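This class of bug can be sketched in C as follows. It is an illustration only: the helper names and the 32-bit operand types are assumptions, and this is not the expression the commit changed.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert a page count to bytes.
 * The cast widens one operand before the multiply, so the product is
 * computed in 64 bits.
 */
uint64_t pages_to_bytes(uint32_t npages, uint32_t page_size)
{
	return (uint64_t)npages * page_size;
}

/*
 * Same computation with the cast lost: the multiply is done in 32 bits
 * and wraps before the result is widened for the return value.
 */
uint64_t pages_to_bytes_truncated(uint32_t npages, uint32_t page_size)
{
	return npages * page_size;
}
```

With 2097152 pages of 4096 bytes (8 GiB), the first function yields the full byte count while the second wraps to zero.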


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
while the simpler version that is already committed just reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
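
The arithmetic described above can be sketched as follows. This is a
user-space illustration, not the kernel or dmesg code: the helper names
(ts_sub, ts_add, boottime_from, message_tod) are assumptions made for the
example, and only the idea (boottime is the time of day minus the uptime at
the moment the time of day is set, and a message's time of day is boottime
plus its uptime stamp) comes from the commit message.

```c
#include <assert.h>
#include <time.h>

#define NS_PER_S 1000000000L

/* a - b, assuming a >= b and both normalized (0 <= tv_nsec < 1e9). */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < NS_PER_S);
	assert(b.tv_nsec >= 0 && b.tv_nsec < NS_PER_S);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {		/* borrow one second */
		r.tv_sec--;
		r.tv_nsec += NS_PER_S;
	}
	return r;
}

/* a + b for normalized inputs, carrying nanoseconds into seconds. */
static struct timespec
ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= NS_PER_S) {
		r.tv_sec++;
		r.tv_nsec -= NS_PER_S;
	}
	return r;
}

/* Boot time: time of day when it was set, minus uptime at that moment. */
static struct timespec
boottime_from(struct timespec tod_now, struct timespec uptime_now)
{
	return ts_sub(tod_now, uptime_now);
}

/* Time of day of a logged message: boot time plus its uptime stamp. */
static struct timespec
message_tod(struct timespec boottime, struct timespec stamp)
{
	return ts_add(boottime, stamp);
}
```

For instance, if the time of day is set to 1000.2 s when the clocks have
already been running for 6.5 s, the boot time comes out as 993.7 s rather
than 1000.2 s, which removes the constant offset from every converted stamp.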


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
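
Since the use of initpaths is undocumented, the following is only a sketch
of how such a NULL-terminated fallback list is typically walked; it is not
taken from start_init(). The list contents are those given above, while
first_usable_init() and the two demo predicates are hypothetical names
standing in for a real exec attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as described in the commit message, NULL-terminated. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL,
};

/*
 * Try each path in order and return the first one for which the
 * supplied callback reports success (0), or NULL if none works.
 */
static const char *
first_usable_init(int (*try_exec)(const char *))
{
	assert(try_exec != NULL);
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Demo predicates standing in for a real exec attempt. */
static int
only_rescue_works(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

static int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

Because the new entry is appended at the end, it is only reached when the
three earlier candidates have all failed.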


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
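
As an illustration of what one of these statistical tests looks like, here
is a sketch of the FIPS 140-2 monobit test, which counts the one bits in a
20000-bit sample and passes only if the count lies strictly between 9725
and 10275. This is not the code in libkern/rngtest.c; the names
FIPS_SAMPLE_BYTES, count_ones() and monobit_ok() are assumptions made for
the example, and the other tests in the battery are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIPS_SAMPLE_BYTES 2500	/* 20000 bits per sample */

/* Count the set bits in a buffer, one byte at a time. */
static unsigned
count_ones(const uint8_t *buf, size_t len)
{
	unsigned ones = 0;

	for (size_t i = 0; i < len; i++) {
		/* Clear the lowest set bit until the byte is zero. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	return ones;
}

/* Monobit test: pass (nonzero) iff 9725 < ones < 10275. */
static int
monobit_ok(const uint8_t *buf)
{
	unsigned ones;

	assert(buf != NULL);
	ones = count_ones(buf, FIPS_SAMPLE_BYTES);
	return ones > 9725 && ones < 10275;
}
```

A stuck source that emits all zeros or all ones fails immediately, while a
genuinely random sample will occasionally land outside the bounds, which is
where the small false-positive probability comes from.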

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
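
The reply handling described in 1.370 can be pictured with a small userland sketch. This is not the code from init_main.c: the enum, the function name classify_init_reply() and the rule that a usable path starts with '/' are assumptions made only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of a line typed at the init path prompt. */
enum init_reply {
	REPLY_HALT,
	REPLY_REBOOT,
	REPLY_DDB,
	REPLY_PATH,	/* treated as a path to try as init */
	REPLY_INVALID	/* caller would ask again */
};

enum init_reply
classify_init_reply(const char *line)
{
	/* The three special strings named in the commit message. */
	if (strcmp(line, "halt") == 0)
		return REPLY_HALT;
	if (strcmp(line, "reboot") == 0)
		return REPLY_REBOOT;
	if (strcmp(line, "ddb") == 0)
		return REPLY_DDB;

	/* Assumption: only absolute paths count as candidate init paths. */
	if (line[0] == '/')
		return REPLY_PATH;

	return REPLY_INVALID;
}
```

Under these assumptions, classify_init_reply("/sbin/init") yields REPLY_PATH, and an unrecognised word yields REPLY_INVALID so the caller can prompt again.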


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
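
A rough userland sketch of the behaviour described in 1.364, assuming C11 atomics and a pthread mutex as stand-ins for the kernel primitives. struct once_sketch, run_once_sketch() and demo_failing_init() are hypothetical names, and holding the lock while the routine runs is a simplification that may not match the kernel code.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical control block; not the kernel's own type. */
struct once_sketch {
	atomic_int done;	/* 0 = not run yet, 1 = finished */
	int error;		/* result of the single run */
};

/* One global lock shared by every control block. */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Lockless fast path: the routine has already finished. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run a single time; an error is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Demo routine that always fails, to show the no-retry behaviour. */
int demo_calls;

int
demo_failing_init(void)
{
	demo_calls++;
	return 5;
}
```

Calling run_once_sketch() twice with demo_failing_init() returns the recorded error both times while the routine itself runs only once.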


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
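
The limit described in 1.359 amounts to simple arithmetic, sketched below. The function names and the exact rounding are assumptions for illustration; only the 2kB worst-case cost and the 75% bounds come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode assumed by the commit message. */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest vnode count whose worst-case footprint
 * stays within 75% of both KVA and physical memory.
 */
uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so large sizes cannot overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical check a sysctl handler could apply to a requested value. */
int
vnode_request_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= vnode_limit(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the smaller figure governs: 75% of 1 GiB divided by 2048 bytes gives 393216 vnodes.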


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
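
The weak alias technique mentioned in 1.287 can be sketched as follows, assuming GCC or Clang on an ELF target. Both function names are hypothetical stand-ins, not the symbols used in the kernel source.

```c
#include <assert.h>

/* Fallback used when nothing else supplies a start routine. */
void
secmodel_nop_sketch(void)
{
	/* intentionally empty */
}

/*
 * Weak alias: the name resolves to the fallback above unless another
 * object file provides a strong definition of secmodel_start_sketch().
 */
void secmodel_start_sketch(void)
    __attribute__((weak, alias("secmodel_nop_sketch")));
```

When no strong definition is linked in, a call to secmodel_start_sketch() lands in the empty fallback, so callers need no #ifdef around the call.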


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
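
The reference-counting rule described above (a lookup holds a reference, a release drops it, and detaching an in-use discipline fails with EBUSY) can be sketched in user-space C. This is only an illustration under stated assumptions: the `ldisc_*` names and the `struct ldisc` layout are hypothetical stand-ins, not the kernel's ttyldisc_*() interface, and the locking a kernel would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a registered line discipline. */
struct ldisc {
	const char *name;
	unsigned int refcnt;	/* references held by outstanding lookups */
	struct ldisc *next;
};

/* A linked list takes the place of a fixed array of disciplines. */
struct ldisc *ldisc_list;

/* Register a discipline; names must be unique. */
int
ldisc_attach(struct ldisc *ld)
{
	struct ldisc *p;

	assert(ld != NULL && ld->name != NULL);
	for (p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	}
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Find a discipline by name and take a reference on it. */
struct ldisc *
ldisc_lookup(const char *name)
{
	struct ldisc *ld;

	for (ld = ldisc_list; ld != NULL; ld = ld->next) {
		if (strcmp(ld->name, name) == 0) {
			ld->refcnt++;
			return ld;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister a discipline; refuse while references are held. */
int
ldisc_detach(struct ldisc *ld)
{
	struct ldisc **pp;

	if (ld->refcnt != 0)
		return EBUSY;
	for (pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

In this sketch a caller pairs every successful ldisc_lookup() with one ldisc_release(); only then can ldisc_detach() succeed.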


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
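
The "once control" idea used here, running a subsystem's initializer lazily on first use instead of from main(), can be sketched as follows. This is a minimal single-threaded illustration: `struct once`, `run_once()`, and the `subsystem_*` names are hypothetical, and a real kernel once control also has to serialize concurrent callers, which this sketch does not attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once control: remembers whether the initializer ran. */
struct once {
	int done;	/* nonzero after the initializer has been called */
	int error;	/* result of the initializer, replayed to later callers */
};

#define ONCE_INITIALIZER { 0, 0 }

/* Call fn() the first time only; later calls return the saved result. */
int
run_once(struct once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);
	if (!o->done) {
		o->error = fn();
		o->done = 1;
	}
	return o->error;
}

/* Example subsystem whose setup is deferred until first open. */
int subsystem_init_calls;

int
subsystem_init(void)
{
	subsystem_init_calls++;
	return 0;
}

struct once subsystem_once = ONCE_INITIALIZER;

int
subsystem_open(void)
{
	/* No explicit init call is needed during boot. */
	return run_once(&subsystem_once, subsystem_init);
}
```

However many times subsystem_open() is called, subsystem_init() runs once, so the boot path no longer has to know about the subsystem.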


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
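
The expansion itself can be illustrated with a small user-space sketch. It is not the kernel's code: the function name `expand_magic()`, the table type, and the rule that a magic name only matches when followed by '/' or the end of the string are assumptions made for this example, and the values in the usage note are placeholders.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: magic name (without '@') and its value. */
struct magic {
	const char *name;
	const char *value;
};

/*
 * Copy "in" to "out", replacing each "@name" that is followed by '/'
 * or the end of the string with the matching value from the table.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int
expand_magic(const char *in, const struct magic *tab, size_t ntab,
    char *out, size_t outsz)
{
	size_t o = 0;

	assert(in != NULL && out != NULL);
	if (outsz == 0)
		return -1;

	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*in == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				/* Whole-component match only. */
				if (strncmp(in + 1, tab[i].name, n) == 0 &&
				    (in[1 + n] == '/' || in[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			if (o + n >= outsz)
				return -1;
			memcpy(out + o, rep, n);
			o += n;
			in += skip;
		} else {
			/* Ordinary character, or '@' with no known name. */
			if (o + 1 >= outsz)
				return -1;
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

With a table mapping `machine_arch` to a placeholder such as "m68k", the link target /arch/@machine_arch/bin from the example above would expand to /arch/m68k/bin, while an unknown name such as @machinery is left untouched.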


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
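
The marker-skipping iteration can be pictured with a small stand-alone list. The sketch below is an assumption-laden illustration, not the kernel macro: `struct entry`, its `is_marker` flag, and `ENTRY_FOREACH` are hypothetical names on a hand-rolled singly linked list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry; markers are placeholders, not real procs. */
struct entry {
	int is_marker;		/* nonzero for an iteration marker */
	int pid;		/* meaningful only for real entries */
	struct entry *next;
};

/* Return the first non-marker entry at or after e, or NULL. */
struct entry *
skip_markers(struct entry *e)
{
	while (e != NULL && e->is_marker)
		e = e->next;
	return e;
}

/* Like a plain FOREACH, but the loop body never sees a marker. */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = skip_markers(head); (var) != NULL;			\
	    (var) = skip_markers((var)->next))

/* Example consumer: sum the pids of all real entries. */
int
sum_pids(struct entry *head)
{
	struct entry *e;
	int sum = 0;

	ENTRY_FOREACH(e, head)
		sum += e->pid;
	return sum;
}
```

Code that walks the list with the skipping macro needs no knowledge of markers, which is what lets an iterator park a marker in the list while it calls out for each entry.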


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
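
The sizing arithmetic can be made concrete with a short sketch. Everything in it is an assumption for illustration: the function name, the per-mille parameter, and the 256- and 512-byte vnode sizes are hypothetical, and the real kernel computes its default from its own notion of physical memory and the actual size of struct vnode.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in a given fraction of physical memory.
 * The fraction is given in per-mille, so 5 means 0.5% and 10 means 1%.
 */
uint64_t
vnodes_for_fraction(uint64_t physmem_bytes, unsigned int permille,
    uint64_t vnode_size)
{
	assert(vnode_size != 0);
	/* Divide first so the multiplication cannot overflow. */
	return (physmem_bytes / 1000) * permille / vnode_size;
}
```

For a 512 MiB machine and a hypothetical 256-byte vnode, 0.5% yields 10485 vnodes and 1% yields 20971. Doubling the vnode size to 512 bytes halves the count at the same percentage, which is the concern raised above for 64-bit ports.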


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
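
The iterate-until-quiescent behaviour described in 1.208 can be sketched
roughly as below. This is an illustration only, not the kernel's code:
`struct finalizer`, `run_finalizers()` and `countdown_finalizer()` are
hypothetical names invented for this sketch, and the real hook's interface
may differ.

```c
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made (always at least 1).
 */
static unsigned
run_finalizers(struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: has *arg units of work left, does one per call. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

The loop always makes one final pass in which nobody reports work, which is
what lets a late finalizer react to state created by an earlier one.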


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.
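
As a small illustration of the style 1.206 moves to, the sketch below walks
a list with LIST_FOREACH from <sys/queue.h> instead of touching the link
members by hand. `struct item`, `item_list` and `sum_items()` are
hypothetical names for this example and do not come from init_main.c.

```c
#include <sys/queue.h>

/* Hypothetical list element; the link field is managed by the macros. */
struct item {
	int value;
	LIST_ENTRY(item) link;
};

LIST_HEAD(item_list, item);

/* Sum all values, iterating with the macro rather than raw pointers. */
static int
sum_items(struct item_list *head)
{
	struct item *it;
	int total = 0;

	LIST_FOREACH(it, head, link)
		total += it->value;

	return total;
}
```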


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
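
The sizing rule in 1.175 amounts to a small calculation. The sketch below
assumes the budget is computed in bytes and divided by a per-vnode size;
the commit message does not say how the kernel actually does the arithmetic,
and `autosize_vnodes()` and its parameters are hypothetical names for this
illustration.

```c
#include <stdint.h>

/*
 * Return the number of vnodes that fit in 0.5% of memory, but never
 * fewer than nvnode_min (standing in for NVNODE).
 * vnode_bytes must be nonzero.
 */
static uint64_t
autosize_vnodes(uint64_t mem_bytes, uint64_t vnode_bytes, uint64_t nvnode_min)
{
	uint64_t budget = mem_bytes / 200;	/* 0.5% of memory */
	uint64_t n = budget / vnode_bytes;

	return n > nvnode_min ? n : nvnode_min;
}
```

With a small memory size the floor wins, so a kernel configured with an
explicit minimum never ends up with fewer vnodes than before.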


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
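
The rule from 1.140 can be sketched as follows: the soft limit is the
compiled-in default clamped to the system-wide maximum, and the hard limit
is that maximum rather than RLIM_INFINITY. `init_limit()` and `rlim_min()`
are hypothetical helpers written for this illustration; the kernel's own
initialization code is not shown in the log.

```c
#include <sys/resource.h>

/* Smaller of two limits. */
static rlim_t
rlim_min(rlim_t a, rlim_t b)
{
	return a < b ? a : b;
}

/*
 * Build an rlimit from a compiled-in default (e.g. NOFILE or MAXUPRC)
 * and a system-wide maximum (e.g. maxfiles or maxproc).
 */
static struct rlimit
init_limit(rlim_t compiled_default, rlim_t system_max)
{
	struct rlimit r;

	r.rlim_cur = rlim_min(compiled_default, system_max);
	r.rlim_max = system_max;
	return r;
}
```

A program calling getrlimit() then sees a finite hard limit it can safely
use to size its own tables.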


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
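
The point to watch in the mapping from 1.129 is that source and destination
swap places for the copy routines. The shims below are hypothetical
(`compat_*` names are invented for this sketch, not part of the kernel) and
simply spell out each line of the table as a function.

```c
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len): source first; memcpy takes destination first. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): overlap-safe copy, so memmove is required. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): argument order is unchanged; zero means equal. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len): memset with an explicit zero fill byte. */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```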


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.520 15-Feb-2020 ad

- Move the LW_RUNNING flag back into l_pflag: updating l_flag without lock
in softint_dispatch() is risky. May help with the "softint screwup"
panic.

- Correct the memory barriers around zombies switching into oblivion.


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
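
The locking pattern described above can be sketched in userland C. This is
only an illustration under stated assumptions: the pthread mutex, the
todr_held flag, the todr_chip and system_time variables and the
suspend_resume() helper are hypothetical stand-ins, not the kernel's
implementation. The sketch also has resettodr() take the lock, which is the
intended design rather than the currently disabled state noted above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userland stand-ins for the kernel's TODR state. */
pthread_mutex_t todr_mutex = PTHREAD_MUTEX_INITIALIZER;
bool todr_held;		/* set while the lock is held; used by assertions */
time_t todr_chip;	/* pretend battery-backed time-of-day register */
time_t system_time;	/* pretend kernel notion of the current time */

void todr_lock(void) { pthread_mutex_lock(&todr_mutex); todr_held = true; }
void todr_unlock(void) { todr_held = false; pthread_mutex_unlock(&todr_mutex); }
bool todr_lock_owned(void) { return todr_held; }

/* Save the system time into the TODR; the caller must hold the lock. */
void todr_save_systime(void)
{
	assert(todr_lock_owned());
	todr_chip = system_time;
}

/* Load the system time from the TODR, or from base if the TODR is unset. */
void todr_set_systime(time_t base)
{
	assert(todr_lock_owned());
	system_time = (todr_chip != 0) ? todr_chip : base;
}

/* Historically named wrappers that acquire and release the lock. */
void resettodr(void) { todr_lock(); todr_save_systime(); todr_unlock(); }
void inittodr(time_t base) { todr_lock(); todr_set_systime(base); todr_unlock(); }

/* The suspend/resume pattern: the lock is held across the whole sequence. */
void suspend_resume(time_t slept)
{
	todr_lock();
	todr_save_systime();
	todr_chip += slept;	/* the TODR keeps ticking while we sleep */
	todr_set_systime(0);
	todr_unlock();
}
```

Holding the lock across save, sleep and restore means no other caller can
change the TODR between the two steps, which is the point of splitting the
locked functions from the historically named wrappers.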


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
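
A minimal sketch of the kind of overflow a lost cast can cause. The function
names and the 32-bit operand types are assumptions chosen for illustration;
the log does not state the types used in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* The product is computed in 32 bits and wraps before it is widened. */
uint64_t bytes_without_cast(uint32_t npages, uint32_t pagesize)
{
	return npages * pagesize;
}

/* Widening one operand first makes the multiplication happen in 64 bits. */
uint64_t bytes_with_cast(uint32_t npages, uint32_t pagesize)
{
	return (uint64_t)npages * pagesize;
}
```

With 2097152 pages of 4096 bytes (8 GiB), the first function wraps modulo
2^32 while the second returns the full byte count.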


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RELEASE netbsd-9-0-RC2 netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
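
The arithmetic described above can be sketched as follows. The helper names
(ts_sub, ts_add, compute_boottime, message_time) are hypothetical and this
is not the code in kern_tc.c or dmesg; it only shows boottime being derived
by subtracting the elapsed interval, and a message timestamp being converted
back to time of day.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Return a - b with tv_nsec normalised; assumes a >= b. */
struct timespec ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec -= 1;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/* Return a + b with tv_nsec normalised. */
struct timespec ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= 1000000000L) {
		r.tv_sec += 1;
		r.tv_nsec -= 1000000000L;
	}
	return r;
}

/*
 * boottime is the time of day at which the clocks were started: the time
 * of day when it was initialised, minus the uptime elapsed by then.
 */
struct timespec compute_boottime(struct timespec tod_at_init,
    struct timespec uptime_at_init)
{
	assert(uptime_at_init.tv_sec >= 0);
	return ts_sub(tod_at_init, uptime_at_init);
}

/* Time of day of a log message stamped with an uptime value. */
struct timespec message_time(struct timespec boottime, struct timespec stamp)
{
	return ts_add(boottime, stamp);
}
```

If the time of day is initialised 6.5 seconds after the clocks start, a
message stamped at uptime 6.5 seconds maps back to exactly that time of day
instead of being offset by the interval.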


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
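
For reference, a version 4 UUID is 16 random bytes with the version and
variant fields overwritten. The sketch below shows only that bit
manipulation; the function name is hypothetical and the random bytes are
supplied by the caller, so it makes no claim about how the kernel obtains
its randomness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Turn 16 caller-supplied random bytes into a version 4 UUID. */
void uuid_v4_from_random(uint8_t out[16], const uint8_t rnd[16])
{
	assert(out != NULL && rnd != NULL);
	memcpy(out, rnd, 16);
	out[6] = (uint8_t)((out[6] & 0x0f) | 0x40);	/* version field = 4 */
	out[8] = (uint8_t)((out[8] & 0x3f) | 0x80);	/* variant bits = 10 */
}
```

Because every other bit is random, no MAC address or timestamp appears in
the result and no global generator state is needed.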


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
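
A minimal sketch of how such a NULL-terminated list can be consumed,
assuming the entries are tried in order until one can be executed. The
pick_init() helper and the probe callbacks are hypothetical; the kernel
performs a real exec attempt rather than calling a probe function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the log entry above. */
const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical probe: returns 0 if path could be executed, nonzero if not. */
typedef int (*exec_probe_t)(const char *path);

/* Return the first entry the probe accepts, or NULL if none works. */
const char *pick_init(const char *const *paths, exec_probe_t try_exec)
{
	assert(paths != NULL && try_exec != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Example probes: only the rescue copy is usable, or nothing is. */
int only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

int nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

Placing "/rescue/init" last means it is reached only after the three /sbin
candidates have failed.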


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
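
The continuous-output check mentioned above for hardware RNGs can be
illustrated with a small userland sketch. It assumes the usual form of
the test (each output block is compared with the previous one, and a
repeat counts as a failure). The names cont_test_t and cont_test_feed
and the 8-byte block size are hypothetical and are not taken from the
NetBSD sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_BYTES 8	/* illustrative block size, not NetBSD's */

/* State for a continuous-output test (hypothetical type). */
typedef struct {
	unsigned char prev[BLOCK_BYTES];	/* last block seen */
	bool have_prev;				/* false until first block */
} cont_test_t;

/*
 * Feed one block to the test.  Returns false if the block is identical
 * to the previous one, which suggests a stuck generator.
 */
static bool
cont_test_feed(cont_test_t *t, const unsigned char block[BLOCK_BYTES])
{
	assert(t != NULL && block != NULL);

	bool ok = !(t->have_prev &&
	    memcmp(t->prev, block, BLOCK_BYTES) == 0);

	/* Remember this block for the next comparison. */
	memcpy(t->prev, block, BLOCK_BYTES);
	t->have_prev = true;
	return ok;
}
```

A caller would detach the source (or otherwise stop trusting it) the
first time cont_test_feed() returns false.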


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed, with no objections, on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
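
The behaviour described above (lockless fast path, one global lock on
the slow path, and no retry after an error) can be sketched in portable
C11 as follows. This is not the kernel code: once_t, run_once and
failing_init are hypothetical names, and a spin flag stands in for the
kernel's global lock, so the sketch simply holds it while the routine
runs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define ONCE_INITIAL	0
#define ONCE_DONE	1

typedef struct {
	atomic_int state;	/* ONCE_INITIAL or ONCE_DONE */
	int error;		/* routine's result, valid once DONE */
} once_t;

/* One global lock shared by every once_t (stand-in for a kernel mutex). */
static atomic_flag once_lock = ATOMIC_FLAG_INIT;

static int
run_once(once_t *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock taken once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	/* Slow path: serialise on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_lock,
	    memory_order_acquire))
		;	/* spin */

	if (atomic_load_explicit(&o->state, memory_order_relaxed) !=
	    ONCE_DONE) {
		/* Run exactly once; an error is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	int error = o->error;
	atomic_flag_clear_explicit(&once_lock, memory_order_release);
	return error;
}

/* Demo routine that always fails, counting how often it is called. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

Calling run_once() twice on the same once_t with failing_init returns
the recorded error both times while init_calls stays at one, which is
the "no retry" semantics the commit message describes.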


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
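
As a rough illustration of the arithmetic above, the sketch below
computes the largest vnode count whose worst-case footprint (2kB each)
stays within 75% of both KVA and physical memory. The function names
and the way the two limits are combined are assumptions made for the
example; the actual sysctl handler is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode and its associated data, per the log. */
#define VNODE_COST_BYTES	2048u

/*
 * Hypothetical helper: largest vnode count that uses at most 75% of
 * the smaller of KVA and physical memory.
 */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t physmem_bytes, uint64_t vnode_cost)
{
	assert(vnode_cost != 0);

	uint64_t smaller = kva_bytes < physmem_bytes ?
	    kva_bytes : physmem_bytes;

	/* Divide before multiplying so large inputs cannot overflow. */
	return (smaller / 4) * 3 / vnode_cost;
}

/* Returns 0 if the requested value is acceptable, -1 otherwise. */
static int
vnode_limit_check(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	uint64_t limit = vnode_limit(kva_bytes, physmem_bytes,
	    VNODE_COST_BYTES);

	return requested <= limit ? 0 : -1;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA bound is the
tighter one and the limit works out to 393216 vnodes.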


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
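
A userland sketch of the weak-alias technique, assuming a GCC or Clang
toolchain targeting ELF. The kernel uses its own macro for this; the
attribute spelling, the secmodel_nop name and the nop_called flag below
are assumptions for illustration.

```c
/* Set by the fallback so a caller can tell which routine ran. */
int nop_called;

/* Fallback routine, used when no security model is linked in. */
void
secmodel_nop(void)
{
	nop_called = 1;
}

/*
 * Weak alias: secmodel_start resolves to secmodel_nop unless another
 * object file supplies a strong definition of secmodel_start, in which
 * case the linker picks that one instead.
 */
void secmodel_start(void) __attribute__((weak, alias("secmodel_nop")));
```

With no strong definition present, calling secmodel_start() simply runs
the nop routine, so callers need no conditional compilation.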


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
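
The reference-counting rules above can be sketched as a small userland
model. All names here (struct ldisc, ldisc_attach and so on) are
hypothetical and are not the kernel's ttyldisc_*() signatures; locking
is omitted, and the EEXIST and ENOENT returns are assumptions of the
sketch. Only the EBUSY behaviour is taken from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct ldisc {
	const char *name;	/* discipline name, e.g. "termios" */
	unsigned refcnt;	/* references held by lookups */
	struct ldisc *next;	/* linked list replaces the old array */
};

static struct ldisc *ldisc_list;

/* Register a discipline; the provider owns the structure. */
int
ldisc_attach(struct ldisc *ld)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Find a discipline by name and hold a reference on it. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	if (ld->refcnt > 0)
		ld->refcnt--;
}

/* Unregister a discipline; refuse while references are held. */
int
ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

A caller that looks up a discipline must release it before the provider
can detach it, which is the property the EBUSY return enforces.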


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().
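
A userland sketch of the once-control idea, assuming C11 atomics. This
is not the kernel's implementation; run_once, struct once_ctl and the
two entry points are hypothetical names, and the busy-wait stands in
for whatever sleeping the kernel does.

```c
#include <stdatomic.h>

enum { ONCE_IDLE, ONCE_RUNNING, ONCE_DONE };

struct once_ctl {
	atomic_int state;	/* zero-initialized statics start as ONCE_IDLE */
};

/* Run fn exactly once per control block, however many callers arrive. */
static void
run_once(struct once_ctl *o, void (*fn)(void))
{
	int expected = ONCE_IDLE;

	if (atomic_compare_exchange_strong(&o->state, &expected,
	    ONCE_RUNNING)) {
		fn();
		atomic_store(&o->state, ONCE_DONE);
		return;
	}
	/* Lost the race or already done: wait until the winner finishes. */
	while (atomic_load(&o->state) != ONCE_DONE)
		continue;
}

static struct once_ctl shared_once;
static int shared_init_calls;	/* how many times the initializer ran */

static void
shared_init(void)
{
	shared_init_calls++;
}

/* Two hypothetical entry points; whichever runs first does the setup. */
int
entry_vfs_init(void)
{
	run_once(&shared_once, shared_init);
	return shared_init_calls;
}

int
entry_syscall(void)
{
	run_once(&shared_once, shared_init);
	return shared_init_calls;
}
```

Since every entry point performs the check itself, no explicit init
call from main() is needed and the subsystem costs nothing until first
use.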


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
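
A simplified userland sketch of the expansion step. It is not the
kernel code: expand_magic, struct magic and the rule that a token must
be followed by '/' or the end of the string are assumptions of this
sketch, as are any values placed in the table.

```c
#include <stddef.h>
#include <string.h>

struct magic {
	const char *name;	/* token following '@' */
	const char *value;	/* replacement text */
};

/*
 * Expand "@name" tokens from src into dst (dstlen bytes including NUL).
 * A token is recognized only when followed by '/' or end of string, so
 * "@machine" does not match the start of "@machine_arch".
 * Returns 0 on success, -1 if the result does not fit.
 */
int
expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic *tab, size_t ntab)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (strncmp(src + 1, tab[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t rlen = strlen(rep);

			if (out + rlen >= dstlen)
				return -1;
			memcpy(dst + out, rep, rlen);
			out += rlen;
			src += skip;
		} else {
			/* Ordinary character, or '@' with no known token. */
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With a table mapping machine_arch to i386, the link target
/arch/@machine_arch/bin from the example above would expand to
/arch/i386/bin; unknown tokens are copied through unchanged.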


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.
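The "run until nobody reports work" rule described above is a fixed-point loop. The sketch below shows only that control flow; `struct finalizer`, `run_finalizers()` and `countdown_finalizer()` are hypothetical names and not the kernel's actual interface.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer repeatedly until a full pass completes in
 * which none of them reports work.  Returns the number of passes made.
 */
static int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0;
	int did_work;

	assert(list != NULL || n == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: reports work for a fixed number of calls, then stops. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Repeating whole passes matters because one finalizer's work (for example, assembling a RAID set) may expose new devices that give another finalizer something to do.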

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
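The mapping table above can be read as a simple translation function. The sketch below is illustrative only: the `OLD_KERN_*` enumerators and their numeric values are assumptions made for this example, and `old_kern_to_errno()` is a hypothetical helper, since the commit replaced the codes at each call site rather than translating them at run time. KERN_FAILURE is left out because the table gives it no single errno.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the retired KERN_* codes (values are assumed). */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old code to the errno value listed in the table; -1 if unmapped. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;	/* both shortage codes collapse to ENOMEM */
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;
}
```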


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
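The sizing rule described above (a fixed fraction of physical memory, but never fewer than the configured minimum) can be sketched as follows. This is an assumption-laden illustration rather than the code from init_main.c: `desired_vnodes()` is a hypothetical helper, the fraction is expressed in per-mille so that 0.5% is 5, and the vnode size is passed in as a parameter instead of using `sizeof(struct vnode)`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in permille/1000 of physical memory,
 * clamped from below by the configured minimum (NVNODE in the log entry).
 */
static uint64_t
desired_vnodes(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t permille, uint64_t configured_min)
{
	uint64_t budget, count;

	assert(vnode_size > 0);
	/* Divide first so the multiplication cannot overflow on large systems. */
	budget = physmem_bytes / 1000 * permille;
	count = budget / vnode_size;
	return count > configured_min ? count : configured_min;
}
```

For example, with 256 MiB of memory and an assumed 256-byte vnode, a 5 per-mille budget yields a few thousand vnodes; a configured minimum larger than that wins instead.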


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.
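The two properties listed above can be illustrated with an intrusive doubly linked list, where the client embeds the handle in its own data and no allocation happens on insert. This is a sketch of the general technique only: `struct callout_sketch` and the functions below are hypothetical, and a single list stands in for whatever timing structure the kernel actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* Handle storage is owned by the client, typically embedded in its own struct. */
struct callout_sketch {
	struct callout_sketch *prev, *next;
	void (*fn)(void *);
	void *arg;
	int pending;
};

/* Initialise an empty circular list whose head is a sentinel node. */
static void
bucket_init(struct callout_sketch *head)
{
	head->prev = head->next = head;
	head->fn = NULL;
	head->arg = NULL;
	head->pending = 0;
}

/* O(1): link the caller-supplied handle at the tail; nothing is allocated. */
static void
callout_insert(struct callout_sketch *head, struct callout_sketch *c,
    void (*fn)(void *), void *arg)
{
	assert(!c->pending);
	c->fn = fn;
	c->arg = arg;
	c->prev = head->prev;
	c->next = head;
	head->prev->next = c;
	head->prev = c;
	c->pending = 1;
}

/* O(1): unlink using only the handle's own neighbours; no search needed. */
static void
callout_remove(struct callout_sketch *c)
{
	if (!c->pending)
		return;
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = NULL;
	c->pending = 0;
}

static int
bucket_empty(const struct callout_sketch *head)
{
	return head->next == head;
}
```

Because the handle carries its own links, scheduling cannot fail for lack of memory, which is the resource-allocation problem the entry refers to.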

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump,
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
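The point to watch in the table above is that bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. The wrappers below are hypothetical helpers written only to make that argument swap visible; they keep the old source-first order in their own signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* was: bcopy(src, dst, len); regions must not overlap */
static void
copy_bytes(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* was: ovbcopy(src, dst, len); overlapping regions are allowed */
static void
move_bytes(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* was: bzero(buf, len) */
static void
zero_bytes(void *buf, size_t len)
{
	memset(buf, 0, len);
}

/* was: bcmp(a, b, len) == 0 */
static int
same_bytes(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}
```

Note that bcmp() only promised zero or nonzero, so code that tested its result for equality carries over to memcmp() unchanged.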


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.519 28-Jan-2020 ad

Call radix_tree_init() earlier, so more stuff can make use of radixtree.


Revision tags: ad-namecache-base2 ad-namecache-base1
# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

branches: 1.517.2;
- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
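
The suspend/resume pattern listed above can be sketched in user-space C.
This is a simulation under stated assumptions, not the kernel code: every
sim_* name is hypothetical, the mutex is replaced by a boolean flag, and the
hardware clock is just a variable that is advanced by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TODR mutex, the hardware clock and the
 * kernel's notion of system time. */
static bool sim_todr_locked;
static long sim_todr_value;
static long sim_system_time;

void
sim_todr_lock(void)
{
    assert(!sim_todr_locked);   /* no recursive locking in this sketch */
    sim_todr_locked = true;
}

void
sim_todr_unlock(void)
{
    assert(sim_todr_locked);
    sim_todr_locked = false;
}

/* Write system time to the (simulated) hardware clock; lock must be held. */
void
sim_todr_save_systime(void)
{
    assert(sim_todr_locked);
    sim_todr_value = sim_system_time;
}

/* Read the (simulated) hardware clock back into system time; lock held. */
void
sim_todr_set_systime(void)
{
    assert(sim_todr_locked);
    sim_system_time = sim_todr_value;
}

/* Hold the lock across the whole save / sleep / restore sequence. */
long
sim_suspend_resume(long now, long seconds_asleep)
{
    sim_system_time = now;
    sim_todr_lock();
    sim_todr_save_systime();
    sim_todr_value += seconds_asleep;   /* clock keeps ticking while asleep */
    sim_todr_set_systime();
    sim_todr_unlock();
    return sim_system_time;
}
```

Holding the lock for the entire sequence means nothing else can touch the
clock between the save and the restore, which is the point of the pattern.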


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
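
The class of bug described here (a size computed in too narrow a type because
a cast went missing) can be illustrated in user-space C. This is only a
sketch, not the kernel's code: the function names and the 4096-byte page size
are assumptions, and unsigned 32-bit arithmetic is used so that the
wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical page size for illustration; the real value is per-port. */
#define EXAMPLE_PAGE_SIZE 4096u

/* Without the cast: the product is computed in 32 bits and wraps. */
uint64_t
avail_bytes_no_cast(uint32_t npages)
{
    uint32_t bytes = npages * EXAMPLE_PAGE_SIZE;
    return bytes;
}

/* With the cast: one operand is widened first, so the product is 64-bit. */
uint64_t
avail_bytes_with_cast(uint32_t npages)
{
    return (uint64_t)npages * EXAMPLE_PAGE_SIZE;
}

/* 6 GiB worth of 4 KiB pages shows up as 2 GiB when the cast is missing. */
void
overflow_demo(void)
{
    uint32_t npages = 1572864u;     /* 6 GiB / 4 KiB */

    assert(avail_bytes_no_cast(npages) == 2147483648ULL);
    assert(avail_bytes_with_cast(npages) == 6442450944ULL);
}
```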


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
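
As a worked example of that subtraction, here is a small user-space sketch.
The struct and function names are hypothetical (the kernel uses its own
timespec helpers); it only shows that boottime is the wall-clock time at
initialisation minus the uptime already accumulated at that moment.

```c
#include <assert.h>

/* Hypothetical seconds + nanoseconds pair, standing in for a timespec. */
struct example_ts {
    long long sec;
    long nsec;
};

/* Return a - b, keeping nsec normalised into [0, 1000000000). */
struct example_ts
example_ts_sub(struct example_ts a, struct example_ts b)
{
    struct example_ts r;

    r.sec = a.sec - b.sec;
    r.nsec = a.nsec - b.nsec;
    if (r.nsec < 0) {
        r.sec--;
        r.nsec += 1000000000L;
    }
    return r;
}

/* boottime = time of day when it was initialised - uptime at that moment. */
struct example_ts
example_boottime(struct example_ts tod_at_init, struct example_ts uptime)
{
    return example_ts_sub(tod_at_init, uptime);
}
```

With a time of day of 1540857606.25 s set when the clocks had already been
running for 6.5 s, the sketch yields a boottime of 1540857599.75 s.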

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
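
A minimal sketch of how such a NULL-terminated fallback list can be walked,
assuming the entries are tried in order until one can be started. The names
example_initpaths and pick_init, and the can_exec callback, are hypothetical
and only stand in for the real exec attempt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy of the list from the commit message, NULL-terminated. */
static const char *const example_initpaths[] = {
    "/sbin/init",
    "/sbin/oinit",
    "/sbin/init.bak",
    "/rescue/init",
    NULL
};

/*
 * Return the first path that the caller-supplied predicate accepts,
 * or NULL if none of the candidates is usable.
 */
const char *
pick_init(int (*can_exec)(const char *))
{
    for (size_t i = 0; example_initpaths[i] != NULL; i++) {
        if (can_exec(example_initpaths[i]))
            return example_initpaths[i];
    }
    return NULL;
}

/* Hypothetical predicates: only /rescue/init works, or nothing works. */
int
only_rescue_works(const char *path)
{
    return strcmp(path, "/rescue/init") == 0;
}

int
nothing_works(const char *path)
{
    (void)path;
    return 0;
}
```

Because /rescue/init is last, it is reached only after every entry before it
has been rejected.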


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate the vnode cache size based on the resource it gets allocated from.
This stops kern.maxvnodes from being set so high that it exhausts available
space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
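
The behaviour described above (lockless fast path, one global lock, no
retry after an error) can be sketched in portable C11. This is a
userspace analogue under stated assumptions, not the kernel's code: the
names once_t, run_once() and failing_init() are hypothetical, and a
simple spin flag stands in for the kernel's global lock.

```c
#include <stdatomic.h>

/* Hypothetical control block: `done` is set after the first attempt. */
typedef struct {
	atomic_int done;
	int error;	/* result of the single attempt */
} once_t;

/* One global lock shared by all once_t objects (stand-in spin flag). */
static atomic_flag once_lock = ATOMIC_FLAG_INIT;

static int
run_once(once_t *o, int (*fn)(void))
{
	int error;

	/* Lockless fast path: already attempted, return the recorded result. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_lock,
	    memory_order_acquire))
		;	/* spin until the global lock is free */

	/* Re-check under the lock; only the first caller runs fn. */
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = fn();
		/* Mark as done even on failure: no retry after an error. */
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	error = o->error;
	atomic_flag_clear_explicit(&once_lock, memory_order_release);
	return error;
}

/* Example initializer (assumption): counts calls and always fails. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

A second call after a failure returns the same error without running
the routine again, which mirrors the "does not retry" semantics stated
in the message.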


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
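
The arithmetic behind this limit can be sketched as follows. Only the 2kB per-vnode cost and the 75% bounds come from the log message; the function names and the way the two bounds are combined (taking the smaller one) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST_SKETCH 2048u	/* 2kB worst case per vnode, per the log */

/* Largest vnode count whose worst-case cost stays within 75% of `bytes`. */
static uint64_t
vnode_limit_for(uint64_t bytes)
{
	return (bytes / 4 * 3) / VNODE_COST_SKETCH;
}

/* Return 0 if `want` is acceptable, -1 if it would exceed either bound. */
static int
check_desiredvnodes(uint64_t want, uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t kva_max = vnode_limit_for(kva_bytes);
	uint64_t phys_max = vnode_limit_for(phys_bytes);
	uint64_t limit = kva_max < phys_max ? kva_max : phys_max;

	return want <= limit ? 0 : -1;
}
```

With 1 GiB of KVA and 512 MiB of physical memory, the bounds work out to 393216 and 196608 vnodes, so the physical memory bound is the one that applies.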


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.
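
From userland, the two new names are queried through sysconf(3). A minimal sketch, with hypothetical wrapper names:

```c
#include <assert.h>
#include <unistd.h>

/* Number of processors configured in the system. */
static long
cpus_configured(void)
{
	return sysconf(_SC_NPROCESSORS_CONF);
}

/* Number of processors currently online. */
static long
cpus_online(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}
```

Both calls return -1 on failure, so callers that size thread pools from these values should check for that before use.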


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
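
The idea of a weak no-op fallback can be sketched with the GCC/Clang weak attribute. Note the swap: the commit uses a weak alias to an existing nop routine, while this sketch uses a weak definition directly, which behaves similarly for the purpose of illustration. boot_sketch and boot_steps are hypothetical names.

```c
#include <assert.h>

static int boot_steps;

/*
 * Weak default that does nothing.  If a security model linked into the
 * same image provides a strong secmodel_start(), the linker picks that
 * one instead.
 */
__attribute__((weak)) void
secmodel_start(void)
{
}

/* Stand-in for the boot path: the call is safe with or without a secmodel. */
static int
boot_sketch(void)
{
	secmodel_start();
	boot_steps++;
	return boot_steps;
}
```

The caller never needs an #ifdef around the call; whether a real security model is present is decided at link time.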


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
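
The list-plus-refcount scheme described above might look roughly like this in plain C. This is a sketch only: the *_sketch names are hypothetical stand-ins for ttyldisc_attach(), ttyldisc_detach() and ttyldisc_release(), locking is omitted, and the EEXIST and ENOENT returns are assumptions. Only the EBUSY result for an in-use discipline comes from the log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct ldisc_sketch {
	const char *name;
	int refcnt;			/* references held by lookups */
	struct ldisc_sketch *next;	/* linked list replaces the array */
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline; duplicate names are refused (assumed EEXIST). */
static int
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch *p;

	for (p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* A successful lookup takes a reference that the caller must release. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	struct ldisc_sketch *p;

	for (p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	d->refcnt--;
}

/* Detaching an in-use discipline fails with EBUSY. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch **pp;

	if (d->refcnt != 0)
		return EBUSY;
	for (pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	return ENOENT;
}

/* Hypothetical discipline used to exercise the functions above. */
static struct ldisc_sketch demo_termios = { "termios", 0, NULL };
```

A provider attaches its discipline once, users look it up and release it when done, and detach succeeds only after the last reference is gone.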


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching(ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks
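
The key index to node mapping described in this entry can be sketched as a bounds-checked array in front of a slower lookup. Everything named *_sketch or demo_* is hypothetical, KEYIX_NONE_SKETCH merely stands in for IEEE80211_KEYIX_NONE, an integer sender id stands in for a MAC address, and locking is omitted. The real net80211 code is more involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define KEYIX_NONE_SKETCH ((uint16_t)0xffff)	/* "no rx key index" marker */

struct node_sketch {
	int id;			/* stand-in for the station address */
	uint16_t rx_keyix;	/* rx key index of the unicast key, or NONE */
};

struct keymap_sketch {
	struct node_sketch **map;	/* indexed by hardware key index */
	uint16_t max_keyix;		/* table size, supplied by the driver */
};

/* Allocate the table once the driver has announced its max key index. */
static int
keymap_init_sketch(struct keymap_sketch *km, uint16_t max_keyix)
{
	km->map = NULL;
	km->max_keyix = 0;
	if (max_keyix == 0)
		return 0;	/* naive driver: no map at all */
	km->map = calloc(max_keyix, sizeof(*km->map));
	if (km->map == NULL)
		return -1;
	km->max_keyix = max_keyix;
	return 0;
}

static struct node_sketch *
find_rxnode_sketch(struct keymap_sketch *km, uint16_t keyix,
    struct node_sketch *(*slow_lookup)(int), int sender)
{
	struct node_sketch *ni;

	/* Fast path: bounds-checked use of the hardware key index. */
	if (keyix != KEYIX_NONE_SKETCH && keyix < km->max_keyix &&
	    km->map[keyix] != NULL)
		return km->map[keyix];

	/* Slow path: ordinary lookup, then remember the mapping if any. */
	ni = slow_lookup(sender);
	if (ni != NULL && ni->rx_keyix != KEYIX_NONE_SKETCH &&
	    ni->rx_keyix < km->max_keyix)
		km->map[ni->rx_keyix] = ni;
	return ni;
}

/* Hypothetical station table and slow lookup used for demonstration. */
static struct node_sketch demo_nodes[2] = {
	{ 10, 3 }, { 11, KEYIX_NONE_SKETCH }
};
static int demo_slow_lookups;
static struct keymap_sketch demo_km;

static struct node_sketch *
demo_slow_lookup(int sender)
{
	size_t i;

	demo_slow_lookups++;
	for (i = 0; i < sizeof(demo_nodes) / sizeof(demo_nodes[0]); i++)
		if (demo_nodes[i].id == sender)
			return &demo_nodes[i];
	return NULL;
}
```

The first frame from station 10 goes through the slow lookup and fills slot 3; later frames carrying key index 3 are resolved from the table alone. A node whose rx key index is left at the NONE marker never enters the map, which matches the default described for naive drivers.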

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week
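
A rough sketch of the two points in this entry: the beacon and listen intervals live in separate fields, and received frames with an out-of-range beacon interval are dropped rather than clamped. The limits 25 and 1000 are assumed values chosen for illustration, and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define BINTVAL_MIN_SKETCH 25	/* assumed lower bound, illustration only */
#define BINTVAL_MAX_SKETCH 1000	/* assumed upper bound, illustration only */

struct wlan_cfg_sketch {
	uint16_t bintval;	/* configured beacon interval */
	uint16_t lintval;	/* listen interval, stored separately */
};

static int
beacon_interval_ok(uint16_t bintval)
{
	return bintval >= BINTVAL_MIN_SKETCH && bintval <= BINTVAL_MAX_SKETCH;
}

/* Return 1 if the frame is accepted, 0 if dropped (and counted). */
static int
rx_beacon_sketch(uint16_t rx_bintval, unsigned *dropped)
{
	if (!beacon_interval_ok(rx_bintval)) {
		(*dropped)++;
		return 0;
	}
	return 1;
}

/* Changing the listen interval leaves the beacon interval untouched. */
static void
set_listen_interval_sketch(struct wlan_cfg_sketch *cfg, uint16_t lintval)
{
	cfg->lintval = lintval;
}

/* Hypothetical state used for demonstration. */
static struct wlan_cfg_sketch demo_cfg = { 100, 1 };
static unsigned demo_dropped;
```

Setting the listen interval to 0 here leaves demo_cfg.bintval at 100, which is the separation the first bullet asks for.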

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
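
A user-space sketch of how such an expansion could work, not the kernel implementation. The struct magic_var table and the magic_expand() name are hypothetical, and the rule that a token must be followed by '/' or the end of the string is an assumption made here so that "@machine" cannot match the front of "@machine_arch".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name (without '@') and its value. */
struct magic_var {
	const char *name;
	const char *value;
};

/*
 * Copy src to dst, replacing each "@name" token found in vars[].
 * Returns 0 on success, -1 if the result does not fit in dstlen bytes.
 * Unknown tokens are copied through unchanged.
 */
static int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				/* Token must end at '/' or end of string. */
				if (strncmp(src + 1, vars[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = vars[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			if (out + n >= dstlen)
				return -1;
			memcpy(dst + out, rep, n);
			out += n;
			src += skip;
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With a table that maps machine_arch to "sparc", the link target from the example usage, /arch/@machine_arch/bin, expands to /arch/sparc/bin.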


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
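
A simplified sketch of a marker-skipping iteration macro. It uses a plain singly linked list and hypothetical names (struct proc_entry, is_marker, PROC_ENTRY_FOREACH) rather than the kernel's proclist and <sys/queue.h> macros, so treat it as an illustration of the idea only.

```c
#include <stddef.h>

/* Hypothetical list entry; a marker is a placeholder, not a real process. */
struct proc_entry {
	struct proc_entry *next;
	int is_marker;		/* nonzero for an iterator's placeholder */
	int pid;
};

/*
 * Visit every non-marker entry.  The trailing "if" means the loop body
 * only runs for real entries; avoid following it with a bare "else".
 */
#define PROC_ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (!(var)->is_marker)

/* Example consumer: count the real entries on a list. */
static int
count_real_entries(struct proc_entry *head)
{
	struct proc_entry *p;
	int n = 0;

	PROC_ENTRY_FOREACH(p, head)
		n++;
	return n;
}
```

An iterator that calls out for each entry can leave a marker in the list to remember its position; other walkers then need a macro like this so they never mistake the marker for a process.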


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
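
The sizing rule can be sketched as follows. The function name and its parameters are hypothetical; the minimum corresponds to the configured NVNODE floor mentioned in rev. 1.175, and the per-vnode size is passed in because it differs between ports.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of vnodes that fit in 1% of physical
 * memory, but never fewer than the configured minimum.
 */
static uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t min_vnodes)
{
	uint64_t n = (physmem_bytes / 100) / vnode_size;

	return n < min_vnodes ? min_vnodes : n;
}
```

A port with a larger struct vnode gets fewer vnodes from the same 1%, which is why the commit suggests that 64-bit ports may want a larger fraction.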


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
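
The "invoke until nobody did any work" loop can be sketched like this. The types and names (finalizer_fn, struct finalizer, run_finalizers, countdown_finalizer) are hypothetical and are not the kernel's autoconfiguration API.

```c
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports work.  Returns the number of passes made.
 */
static int
run_finalizers(const struct finalizer *f, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (f[i].fn(f[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Demo finalizer: has *arg units of work left and does one per call. */
static int
countdown_finalizer(void *arg)
{
	int *left = arg;

	if (*left > 0) {
		(*left)--;
		return 1;
	}
	return 0;
}
```

Repeating whole passes matters because one finalizer's work (for example, configuring a RAID set) may expose new devices that give another finalizer something to do.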


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
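
The mapping above, written out as a lookup function. The OLD_KERN_* enumerators and their numeric values are placeholders invented for this sketch (the prefix avoids clashing with any system header); only the name-to-errno pairs come from the commit message.

```c
#include <errno.h>

/* Placeholder enumeration; the numeric values are not the historical ones. */
enum old_kern_code {
	OLD_KERN_SUCCESS = 0,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Return the errno value listed for an old code, or -1 when the commit
 * gives no single replacement (KERN_FAILURE was handled case by case).
 */
static int
old_kern_code_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```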


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
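
An illustration of the kind of overflow a wider type avoids, assuming the amount is computed as a page count times a page size. The function names are hypothetical and uint64_t stands in for rlim_t; the actual kernel expression is not shown in the commit message.

```c
#include <stdint.h>

/* Widen before multiplying so the product cannot wrap at 32 bits. */
static uint64_t
pages_to_bytes(uint32_t npages, uint32_t page_size)
{
	return (uint64_t)npages * page_size;
}

/* The narrow version: the result is truncated modulo 2^32. */
static uint32_t
pages_to_bytes_narrow(uint32_t npages, uint32_t page_size)
{
	return (uint32_t)(npages * page_size);
}
```

With 1048576 pages of 4096 bytes (4 GB), the narrow version wraps to 0 while the wide version returns the full byte count.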


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running on/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
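
A toy timing wheel illustrating the two properties listed above: the caller owns the struct callout storage, and insertion and removal are constant time. All names, the wheel size, and the tick handling are assumptions for this sketch; it is not the kernel's callout(9) implementation and does no locking.

```c
#include <stddef.h>

#define WHEEL_SIZE	8		/* hypothetical; must be a power of two */
#define WHEEL_MASK	(WHEEL_SIZE - 1)

/* Caller-supplied storage; zero-initialize before first use. */
struct callout {
	struct callout *next;
	struct callout **prevp;		/* pointer to whatever points at us */
	unsigned long expire;		/* absolute tick at which to fire */
	void (*func)(void *);
	void *arg;
};

struct wheel {
	struct callout *bucket[WHEEL_SIZE];
	unsigned long now;
};

/* O(1): push onto the bucket selected by the expiry tick (ticks >= 1). */
static void
callout_insert(struct wheel *w, struct callout *c, unsigned long ticks,
    void (*func)(void *), void *arg)
{
	struct callout **head;

	c->expire = w->now + ticks;
	c->func = func;
	c->arg = arg;
	head = &w->bucket[c->expire & WHEEL_MASK];
	c->next = *head;
	c->prevp = head;
	if (*head != NULL)
		(*head)->prevp = &c->next;
	*head = c;
}

/* O(1): unlink via the back pointer; harmless if not currently queued. */
static void
callout_remove(struct callout *c)
{
	if (c->prevp == NULL)
		return;
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* Advance one tick and fire entries in this bucket that are due now. */
static void
wheel_tick(struct wheel *w)
{
	struct callout *c, *next;

	w->now++;
	for (c = w->bucket[w->now & WHEEL_MASK]; c != NULL; c = next) {
		next = c->next;
		if (c->expire == w->now) {
			callout_remove(c);
			c->func(c->arg);
		}
	}
}

/* Demo handler: increments the int that arg points to. */
static void
count_fire(void *arg)
{
	++*(int *)arg;
}
```

Entries whose expiry lies more than WHEEL_SIZE ticks ahead share a bucket with nearer ones and are skipped until their tick arrives. A handler that removes other queued callouts would need extra care in this simplified loop.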


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump,
and to relocate it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
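
The "low address and a size" convention can be sketched as follows. This is
an illustration under stated assumptions, not the machine-dependent code
itself: initial_sp() and its grows_down parameter are hypothetical names, and
real ports also apply alignment rules that are omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: derive the child's initial stack pointer from a
 * stack area given as (low address, size). On a downward-growing stack
 * the pointer starts at the high end of the area; on an upward-growing
 * stack it starts at the low end.
 */
static uintptr_t
initial_sp(uintptr_t stack_low, size_t stack_size, int grows_down)
{
	/* The area must not wrap around the address space. */
	assert(stack_size <= UINTPTR_MAX - stack_low);

	return grows_down ? stack_low + stack_size : stack_low;
}
```

Passing the area rather than a ready-made stack pointer keeps the caller
independent of stack direction, which is the point the entry makes.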


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
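
The two rules above can be sketched in a few lines. This is a minimal
illustration, not the kernel code: struct limit_pair and init_limit() are
hypothetical stand-ins for struct rlimit and the proc0 setup in main().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rlimit. */
struct limit_pair {
	uint64_t cur;	/* soft limit (rlim_cur) */
	uint64_t max;	/* hard limit (rlim_max) */
};

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Soft limit: the compile-time default, clamped to the system-wide
 * maximum. Hard limit: the system-wide maximum, not infinity.
 */
static struct limit_pair
init_limit(uint64_t default_soft, uint64_t system_max)
{
	struct limit_pair lp;

	lp.cur = min_u64(default_soft, system_max);
	lp.max = system_max;
	assert(lp.cur <= lp.max);
	return lp;
}
```

For the file limit this would be called as init_limit(NOFILE, maxfiles), and
for the process limit as init_limit(MAXUPRC, maxproc).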


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
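
The mappings above swap the source and destination arguments for the two
copy routines. A minimal userland sketch of the same mapping follows; the
compat_* wrapper names are hypothetical and exist only to show the argument
order.

```c
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len): arguments are swapped. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): safe for overlap. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len): zero means the buffers match. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len). */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Only the zero versus nonzero result of memcmp() matters when it replaces
bcmp(); callers that relied on bcmp() never looked at the sign.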


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.518 08-Jan-2020 ad

Hopefully fix some problems seen with MP support on non-x86, in particular
where curcpu() is defined as curlwp->l_cpu:

- mi_switch(): undo the ~2007ish optimisation to unlock curlwp before
calling cpu_switchto(). It's not safe to let other actors mess with the
LWP (in particular l->l_cpu) while it's still context switching. This
removes l->l_ctxswtch.

- Move the LP_RUNNING flag into l->l_flag and rename to LW_RUNNING since
it's now covered by the LWP's lock.

- Ditch lwp_exit_switchaway() and just call mi_switch() instead. Everything
is in cache anyway so it wasn't buying much by trying to avoid saving old
state. This means cpu_switchto() will never be called with prevlwp ==
NULL.

- Remove some KERNEL_LOCK handling which hasn't been needed for years.


Revision tags: ad-namecache-base
# 1.517 02-Jan-2020 thorpej

- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
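
The locking pattern described above can be sketched with toy stand-ins. None
of this is the kernel implementation: every toy_* name is hypothetical, the
"lock" is a single-threaded flag rather than a mutex, and the TOD hardware is
modelled as one variable. The sketch shows only the shape: primitives that
expect the lock to be held, and historically-named wrappers that acquire and
release it around them.

```c
#include <assert.h>
#include <stdbool.h>

static bool toy_todr_locked;	/* stand-in for the TODR mutex */
static long toy_todr_hw;	/* stand-in for the TOD clock hardware */
static long toy_system_time;	/* stand-in for the kernel's system time */

static void
toy_todr_lock(void)
{
	assert(!toy_todr_locked);
	toy_todr_locked = true;
}

static void
toy_todr_unlock(void)
{
	assert(toy_todr_locked);
	toy_todr_locked = false;
}

static bool
toy_todr_lock_owned(void)
{
	return toy_todr_locked;
}

/* Save system time into the TOD clock; caller must hold the lock. */
static void
toy_todr_save_systime(void)
{
	assert(toy_todr_lock_owned());
	toy_todr_hw = toy_system_time;
}

/* Load system time from the TOD clock; caller must hold the lock. */
static void
toy_todr_set_systime(void)
{
	assert(toy_todr_lock_owned());
	toy_system_time = toy_todr_hw;
}

/* Wrappers in the style of resettodr() and inittodr(). */
static void
toy_resettodr(void)
{
	toy_todr_lock();
	toy_todr_save_systime();
	toy_todr_unlock();
}

static void
toy_inittodr(void)
{
	toy_todr_lock();
	toy_todr_set_systime();
	toy_todr_unlock();
}
```

A suspend path would instead take the lock once, call the save primitive,
sleep, call the set primitive on wakeup, and only then unlock, so that no
other TOD access can slip in between.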


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
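
A minimal sketch of this class of bug, assuming (hypothetically) that the
page count and page size are both held in int; the log does not say which
expression lost its cast. Without the widening cast the multiplication is
done in int and overflows for large memory sizes.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: convert a page count to bytes. The operands are
 * widened to 64 bits before multiplying, so the product cannot overflow
 * the way an int * int multiplication would.
 */
static uint64_t
pages_to_bytes(int npages, int page_size)
{
	assert(npages >= 0 && page_size > 0);

	return (uint64_t)npages * (uint64_t)page_size;
}
```

For example, 2097152 pages of 4096 bytes is 8 GiB, which does not fit in a
32-bit int but is computed correctly once the operands are widened.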


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
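
One way to picture the circular sibling lists is the sketch below. It is an
illustration only: struct toy_cpu, its fields, and link_core_siblings() are
hypothetical, and only the per-core ring is shown (a per-package ring would
be built the same way from a package id).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-CPU record holding just the topology fields we need. */
struct toy_cpu {
	int core_id;			/* which core this CPU belongs to */
	struct toy_cpu *core_sibling;	/* next CPU in the same core (ring) */
};

/*
 * Link every CPU into a circular list with the other CPUs that share
 * its core_id. A CPU with no siblings ends up pointing at itself.
 */
static void
link_core_siblings(struct toy_cpu *cpus, size_t n)
{
	assert(cpus != NULL || n == 0);

	/* Start with each CPU in a ring of one. */
	for (size_t i = 0; i < n; i++)
		cpus[i].core_sibling = &cpus[i];

	/* Splice each CPU into the ring of an earlier CPU in its core. */
	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (cpus[j].core_id == cpus[i].core_id) {
				cpus[i].core_sibling = cpus[j].core_sibling;
				cpus[j].core_sibling = &cpus[i];
				break;
			}
		}
	}
}
```

Because the list is circular, a scheduler can start from any CPU and visit
all of its siblings by following core_sibling until it returns to the start.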


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
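
The arithmetic behind this fix can be sketched with struct timespec. This is
a userland illustration, not the kernel or dmesg code: estimate_boottime()
and message_walltime() are hypothetical names. Boot time is taken as the
time of day at initialisation minus the uptime at that moment, and a
message's wall-clock time is boot time plus the uptime stamped on it.

```c
#include <assert.h>
#include <time.h>

/* a - b, with tv_nsec normalised into [0, 1e9). Requires a >= b. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_sec > b.tv_sec ||
	    (a.tv_sec == b.tv_sec && a.tv_nsec >= b.tv_nsec));

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/* a + b, with tv_nsec normalised into [0, 1e9). */
static struct timespec
ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= 1000000000L) {
		r.tv_sec++;
		r.tv_nsec -= 1000000000L;
	}
	return r;
}

/* Boot time = time of day when the clock was set - uptime at that moment. */
static struct timespec
estimate_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}

/* Wall-clock time of a message stamped with the uptime at logging. */
static struct timespec
message_walltime(struct timespec boottime, struct timespec msg_uptime)
{
	return ts_add(boottime, msg_uptime);
}
```

Subtracting the uptime is what removes the offset of several seconds: the
time of day is only set after root is mounted, well after the clocks start.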


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
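
A minimal sketch of the size mismatch described above, assuming an ABI (such as n32) where a syscall-argument slot is wider than a user pointer. The names sketch_syscallarg_t, sketch_argv_bytes and sketch_args_bytes are hypothetical and are not taken from init_main.c; the point is only that the size of an array of argument slots need not equal 3 * sizeof(char *).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one syscall-argument slot: padded to 64 bits. */
typedef union {
    uint64_t    pad;
    const char *ptr;
} sketch_syscallarg_t;

#define SKETCH_INIT_ARGC 3  /* number of argv[] slots, per the commit text */

/* Bytes that init's argv[] occupies for a given user pointer size. */
size_t
sketch_argv_bytes(size_t user_ptr_size)
{
    assert(user_ptr_size > 0);
    return SKETCH_INIT_ARGC * user_ptr_size;
}

/* Bytes occupied by the same number of syscall-argument slots. */
size_t
sketch_args_bytes(void)
{
    return SKETCH_INIT_ARGC * sizeof(sketch_syscallarg_t);
}
```

With 4-byte user pointers, sketch_argv_bytes(4) is 12 while sketch_args_bytes() is 24, so copying out the size of the slot array would write past the intended argv[] area.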


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)
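
The restrictions listed above can be pictured with a small fixed-size ring of records, each holding one format-string pointer and four integer arguments. This is only an illustrative sketch: the sketch_hist names and the slot count are assumptions and do not reflect the actual kernhist implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HIST_ARGS  4  /* at most four arguments per record */
#define SKETCH_HIST_SLOTS 8  /* arbitrary ring size for the sketch */

struct sketch_hist_ent {
    const char *fmt;                     /* literal format string */
    uintptr_t   args[SKETCH_HIST_ARGS];  /* integer arguments only */
};

struct sketch_hist {
    struct sketch_hist_ent ents[SKETCH_HIST_SLOTS];
    size_t next;   /* slot the next record is written to */
    size_t count;  /* total number of records logged so far */
};

/* Record one message, overwriting the oldest slot once the ring is full. */
void
sketch_hist_log(struct sketch_hist *h, const char *fmt,
    uintptr_t a, uintptr_t b, uintptr_t c, uintptr_t d)
{
    assert(h != NULL && fmt != NULL);
    struct sketch_hist_ent *e = &h->ents[h->next];

    e->fmt = fmt;
    e->args[0] = a;
    e->args[1] = b;
    e->args[2] = c;
    e->args[3] = d;
    h->next = (h->next + 1) % SKETCH_HIST_SLOTS;
    h->count++;
}
```

Because each record stores only a pointer to the format string and raw integers, logging stays cheap, and formatting can be deferred until the history is dumped.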

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
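
Since the use of initpaths is noted as undocumented, here is a hedged sketch of how such a NULL-terminated fallback list is typically walked: try each path in order and stop at the first one that succeeds. The list contents come from the commit message above; sketch_pick_init and the callback type are hypothetical and do not describe the kernel's actual start_init() logic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Path list as given in the commit message. */
const char *const sketch_initpaths[] = {
    "/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical callback: returns 0 if the path could be executed. */
typedef int (*sketch_try_fn)(const char *path);

/* Return the first path the callback accepts, or NULL if none works. */
const char *
sketch_pick_init(const char *const *paths, sketch_try_fn try_exec)
{
    assert(paths != NULL && try_exec != NULL);
    for (size_t i = 0; paths[i] != NULL; i++) {
        if (try_exec(paths[i]) == 0)
            return paths[i];
    }
    return NULL;
}

/* Example callbacks: only the rescue binary exists, or nothing does. */
int
sketch_only_rescue(const char *path)
{
    return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

int
sketch_none(const char *path)
{
    (void)path;
    return -1;
}
```

Appending "/rescue/init" at the end means it is only reached after every earlier entry has failed, which matches its role as a last resort.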


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
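
As an illustration of the kind of check involved, here is a sketch of the FIPS 140-2 monobit test only: count the one bits in a 20,000-bit sample and accept it if the count lies strictly between 9,725 and 10,275. This is not the code in libkern/rngtest.c; the function name and the fixed buffer size are assumptions made for the example, and the full test suite includes further tests not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_FIPS_BYTES 2500  /* 20,000 bits, the FIPS 140-2 sample size */

/* Return 1 if the sample passes the monobit test, 0 otherwise. */
int
sketch_monobit_ok(const uint8_t *buf, size_t len)
{
    assert(buf != NULL && len == SKETCH_FIPS_BYTES);
    unsigned ones = 0;

    for (size_t i = 0; i < len; i++) {
        /* Clear the lowest set bit until none remain. */
        for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
            ones++;
    }
    return ones > 9725 && ones < 10275;
}
```

A stuck-at-zero or stuck-at-one source fails immediately, while a perfectly balanced pattern such as repeated 0xAA passes this particular test, which is why it is only one of several checks.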

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. Systems with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
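
The prompt handling described above can be sketched as a small classifier. This is a user-space illustration only, not the code in init_main.c: the enum, the function name classify_init_answer(), and the rule that a valid path must be absolute are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for one line typed at the init path prompt. */
enum init_answer {
	ANSWER_PATH,	/* treated as a path to try as init (assumed absolute) */
	ANSWER_HALT,	/* special string "halt" */
	ANSWER_REBOOT,	/* special string "reboot" */
	ANSWER_DDB,	/* special string "ddb" */
	ANSWER_INVALID	/* neither a command nor a path; caller re-prompts */
};

static enum init_answer
classify_init_answer(const char *line)
{
	assert(line != NULL);

	/* Special strings are checked before the path rule. */
	if (strcmp(line, "halt") == 0)
		return ANSWER_HALT;
	if (strcmp(line, "reboot") == 0)
		return ANSWER_REBOOT;
	if (strcmp(line, "ddb") == 0)
		return ANSWER_DDB;

	/* Assumption for this sketch: only absolute paths are accepted. */
	if (line[0] == '/')
		return ANSWER_PATH;

	return ANSWER_INVALID;
}
```

A caller would loop on ANSWER_INVALID, printing a usage line before asking again.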


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
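
A minimal user-space sketch of the pattern this message describes, assuming C11 atomics and a simple spin flag in place of the kernel's locks. The names once_sketch_t, run_once_sketch() and failing_init() are hypothetical and are not the RUN_ONCE(9) interface; unlike real code, the sketch also holds the global lock while the routine runs.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical states; the kernel's representation may differ. */
enum { ONCE_IDLE = 0, ONCE_DONE = 1 };

typedef struct {
	atomic_int state;	/* ONCE_IDLE until the routine has run */
	int error;		/* result of the single invocation */
} once_sketch_t;

/* One global lock shared by every once control: no per-object initializer. */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(once_sketch_t *o, int (*fn)(void))
{
	assert(o && fn);

	/* Lockless fast path: the routine already ran, return its result. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		;	/* spin */

	if (atomic_load_explicit(&o->state, memory_order_relaxed) == ONCE_IDLE) {
		/* Record the result even on failure; there is no retry. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->error;
}

/* Example routine that fails, to show the no-retry behaviour. */
static int sketch_calls;

static int
failing_init(void)
{
	sketch_calls++;
	return 5;	/* arbitrary nonzero error for the example */
}
```

Calling run_once_sketch() twice with failing_init() runs the routine once and returns the same error both times, matching the "does not retry" semantics stated above.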


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
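
The arithmetic behind this limit can be illustrated as follows. This is a sketch under the assumptions stated in the message (2kB per vnode, a 75% ceiling on both KVA and physical memory); the function names and the byte-based parameters are hypothetical and do not reflect the actual sysctl handler.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST_BYTES 2048u	/* assumed worst-case cost per vnode */

/* Largest vnode count whose worst-case footprint fits in 75% of both. */
static uint64_t
vnode_limit_sketch(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t smaller = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* 75% is 3/4; divide before multiplying to avoid overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Nonzero if a requested desiredvnodes value would be accepted. */
static int
vnode_request_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	return requested <= vnode_limit_sketch(kva_bytes, physmem_bytes);
}
```

For example, with 1 GiB of KVA and 512 MiB of physical memory the smaller figure governs: 75% of 512 MiB is 384 MiB, which at 2kB each allows 196608 vnodes.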


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
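
The reference-counting rule described above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be sketched in plain C. This is a minimal userland illustration, not the kernel code: struct ldisc and the ldisc_* functions are hypothetical names, and the locking a real kernel would need is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a line discipline entry. */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* references held by successful lookups */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

static struct ldisc *ldisc_head;

/* Register a discipline; fails with EEXIST if the name is already taken. */
int ldisc_attach(struct ldisc *d)
{
	assert(d != NULL && d->name != NULL);
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_head;
	ldisc_head = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *ldisc_lookup(const char *name)
{
	assert(name != NULL);
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void ldisc_release(struct ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
int ldisc_detach(struct ldisc *d)
{
	assert(d != NULL);
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			return 0;
		}
	}
	return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with ldisc_release(); only once the count drops back to zero does ldisc_detach() succeed.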


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
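
The "once control" idea (run an initializer on first use rather than from main()) can be illustrated in userland with POSIX pthread_once(). This is only an analogy under that assumption; the kernel's own once mechanism is not shown in this log, and subsys_init() and subsys_open() are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* One-time control block and a counter to observe how often init ran. */
static pthread_once_t subsys_once = PTHREAD_ONCE_INIT;
static int subsys_init_calls;

/* Hypothetical initializer that used to be called explicitly at boot. */
static void subsys_init(void)
{
	subsys_init_calls++;
}

/*
 * Hypothetical entry point: every caller goes through the once control,
 * so the initializer runs exactly once, on the first open.
 * Returns the number of times the initializer has run.
 */
int subsys_open(void)
{
	int error = pthread_once(&subsys_once, subsys_init);

	assert(error == 0);
	return subsys_init_calls;
}
```

Because the first caller triggers initialization, no explicit boot-time call is needed and a subsystem that is never opened is never initialized.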


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
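
The expansion itself can be sketched as a small string rewrite. The following is an illustration only, not the kernel implementation: magic_expand() and magic_table are hypothetical names, the values are example data, and the rule that a magic name must be followed by '/' or the end of the string is an assumption made here so that "@machine" is not matched inside "@machine_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct magic {
	const char *name;	/* text after the '@' */
	const char *value;	/* replacement text */
};

/* Example values only; a real system would supply its own. */
static const struct magic magic_table[] = {
	{ "machine_arch", "i386" },
	{ "machine", "i386" },
	{ "ostype", "NetBSD" },
};

/*
 * Copy `in` to `out`, replacing each recognized "@name" that is followed
 * by '/' or end of string (an assumption of this sketch).
 * Returns 0 on success, -1 if the result does not fit in `outlen` bytes.
 */
int magic_expand(const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	assert(in != NULL && out != NULL);
	while (*in != '\0') {
		const char *piece = in;	/* default: copy one literal char */
		size_t plen = 1, skip = 1;

		if (*in == '@') {
			size_t count = sizeof(magic_table) / sizeof(magic_table[0]);

			for (size_t i = 0; i < count; i++) {
				size_t n = strlen(magic_table[i].name);

				if (strncmp(in + 1, magic_table[i].name, n) != 0)
					continue;
				if (in[1 + n] != '/' && in[1 + n] != '\0')
					continue;
				piece = magic_table[i].value;
				plen = strlen(piece);
				skip = 1 + n;
				break;
			}
		}
		if (o + plen >= outlen)	/* keep room for the terminator */
			return -1;
		memcpy(out + o, piece, plen);
		o += plen;
		in += skip;
	}
	if (o >= outlen)
		return -1;
	out[o] = '\0';
	return 0;
}
```

With the example table, the link target /arch/@machine_arch/bin from the usage above resolves to /arch/i386/bin, while an unrecognized string such as @machinery is left untouched.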


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
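
The sizing rule can be written out as simple arithmetic. This is a sketch under stated assumptions: default_numvnodes() is a hypothetical helper, the per-vnode size and the minimum are passed in as parameters, and the kernel's actual formula is not shown in this log. The fraction is given in thousandths so that both the old 0.5% (5) and the new 1% (10) defaults can be expressed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in `permille`/1000 of physical memory,
 * but never fewer than `min_vnodes` (the configured NVNODE floor).
 */
uint64_t default_numvnodes(uint64_t physmem_bytes, unsigned permille,
    size_t vnode_size, uint64_t min_vnodes)
{
	uint64_t budget, n;

	assert(vnode_size > 0);
	budget = physmem_bytes / 1000 * permille;	/* bytes set aside for vnodes */
	n = budget / vnode_size;
	return n < min_vnodes ? min_vnodes : n;
}
```

For example, with 1 GiB of memory and an assumed 256-byte vnode, 1% yields 41943 vnodes and 0.5% yields 20971, which shows why larger structures on 64-bit ports get fewer cache entries from the same percentage.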


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
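
The "invoke until nobody did any work" loop described above can be sketched as follows. This is an illustration, not the kernel interface: finalizer_fn, run_finalizers() and the drain_one() example finalizer are hypothetical names, and a finalizer is assumed to return nonzero when it did work.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

/* Example finalizer: handles one pending item per call. */
int drain_one(void *arg)
{
	int *pending = arg;

	assert(pending != NULL);
	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}

/*
 * Call every finalizer repeatedly until a full pass completes in which
 * none of them reports work. Returns the number of passes made.
 */
int run_finalizers(const finalizer_fn *fns, void *const *args, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (fns[i](args[i]))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}
```

Repeating whole passes matters because one finalizer's work (for instance, configuring a RAID set) can expose new devices that give another finalizer something to do.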


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
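
The mapping above can be expressed as a lookup function. This is a sketch only: the enum is a hypothetical stand-in whose numeric values are illustrative, the three unused codes are left out, and KERN_FAILURE returns -1 here because the log says it has no single errno replacement.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins; the numeric values are illustrative only. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_FAILURE,
	KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old KERN_* code to the errno value listed in the table.
 * Returns -1 for KERN_FAILURE, which has no one-to-one replacement.
 */
int kern_to_errno(enum kern_code code)
{
	switch (code) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	case KERN_FAILURE:
		return -1;
	}
	assert(!"unknown kern_code");
	return -1;
}
```

Note that two distinct old codes collapse into ENOMEM, so the translation cannot be reversed.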


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
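
A minimal sketch of the kind of overflow this avoids, assuming the byte count is derived from a page count and a page size. The type rlim_sketch_t and the function avail_bytes() are hypothetical names; the actual code in init_main.c is not shown in this log.

```c
#include <stdint.h>

/* Hypothetical 64-bit stand-in for rlim_t. */
typedef uint64_t rlim_sketch_t;

/*
 * Compute available memory in bytes. Widening one operand before the
 * multiply makes the product 64 bits wide, so 2^20 pages of 4096 bytes
 * no longer wraps the way a 32-bit product would.
 */
rlim_sketch_t
avail_bytes(uint32_t free_pages, uint32_t page_size)
{
	return (rlim_sketch_t)free_pages * page_size;
}
```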


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
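
As an illustration of what a power-of-two alignment request means, the following sketch rounds an address up to the next multiple of the alignment. The helper align_up_sketch() is hypothetical and is not the uvm_map() interface; it only shows the arithmetic, assuming align is a nonzero power of two.

```c
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.
 * Assumes align is a nonzero power of two, so (align - 1) is a bit mask.
 */
uintptr_t
align_up_sketch(uintptr_t addr, uintptr_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```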
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
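
The sizing rule can be sketched roughly as follows. The function name, the per-vnode size parameter, and the exact rounding are assumptions made for illustration; only the 0.5% budget and the NVNODE floor come from the message above.

```c
#include <stdint.h>

/*
 * Pick a vnode cache limit: spend 0.5% of memory on vnodes, but never go
 * below the compiled-in minimum. vnode_size must be nonzero.
 */
uint64_t
desired_vnodes_sketch(uint64_t mem_bytes, uint64_t vnode_size,
    uint64_t nvnode_min)
{
	uint64_t budget = mem_bytes / 200;	/* 0.5% of memory */
	uint64_t n = budget / vnode_size;

	return n < nvnode_min ? nvnode_min : n;
}
```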


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
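
The two properties listed above (caller-supplied storage, constant-time insert and remove) can be sketched with a hashed wheel of doubly linked buckets. Everything here is hypothetical: the names, the wheel size, and the layout are illustrative and should not be read as the kernel's callout implementation.

```c
#include <stddef.h>

#define WHEEL_SIZE_SKETCH 8	/* hypothetical; power of two so a mask works */

struct callout_sketch {
	struct callout_sketch *next;
	struct callout_sketch **prevp;	/* address of the pointer pointing at us */
	unsigned long expires;		/* absolute tick at which to fire */
	void (*func)(void *);
	void *arg;
};

static struct callout_sketch *wheel_sketch[WHEEL_SIZE_SKETCH];

/* O(1): the caller owns the storage; we only link it into a bucket. */
void
callout_sketch_insert(struct callout_sketch *c, unsigned long now,
    unsigned long ticks, void (*func)(void *), void *arg)
{
	struct callout_sketch **head;

	c->expires = now + ticks;
	c->func = func;
	c->arg = arg;
	head = &wheel_sketch[c->expires & (WHEEL_SIZE_SKETCH - 1)];
	c->next = *head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = head;
	*head = c;
}

/* O(1): unlink without walking the bucket. */
void
callout_sketch_remove(struct callout_sketch *c)
{
	if (c->prevp == NULL)
		return;		/* not pending */
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}
```

Because the structure is embedded by the client, neither operation has to allocate memory, which is the resource problem the first bullet refers to.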


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
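
A minimal sketch of the initialization rule described above, with hypothetical names (struct rlimit_sketch, init_limit_sketch()); the real code operates on the kernel's limit structures, which this log does not show.

```c
/* Hypothetical stand-in for a soft/hard resource limit pair. */
struct rlimit_sketch {
	unsigned long cur;	/* soft limit */
	unsigned long max;	/* hard limit */
};

/*
 * Hard limit is the system-wide maximum (e.g. maxfiles or maxproc) rather
 * than infinity; soft limit is the default (e.g. NOFILE or MAXUPRC) clamped
 * to that maximum.
 */
struct rlimit_sketch
init_limit_sketch(unsigned long default_soft, unsigned long system_max)
{
	struct rlimit_sketch rl;

	rl.max = system_max;
	rl.cur = default_soft < system_max ? default_soft : system_max;
	return rl;
}
```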


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
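
The point to watch in this conversion is argument order: bcopy() takes the source first, while memcpy() and memmove() take the destination first. The wrappers below are hypothetical helpers that only illustrate the rewrite; the commit itself edited call sites directly.

```c
#include <string.h>

/* bcopy(src, dst, len) becomes memcpy(dst, src, len): source and destination swap. */
void
bcopy_as_memcpy_sketch(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy() tolerated overlap, so it maps to memmove() with the same swap. */
void
ovbcopy_as_memmove_sketch(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bzero(p, len) becomes memset(p, 0, len): the fill value is inserted. */
void
bzero_as_memset_sketch(void *p, size_t len)
{
	memset(p, 0, len);
}
```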


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.517 02-Jan-2020 thorpej

- Eliminate the global "boottime" variable, which was being accessed
without any synchronization against changes by e.g. clock_settime().
- Replace with new getbinboottime() / getnanoboottime() / getmicroboottime()
functions (naming mirrors that of other time access functions in kern_tc.c).
It returns the (maybe-converted) value of timebasebin, which also tracks
our estimate of when the system was booted (i.e. the legacy "boottime" was
redundant).

XXX There needs to be a lockless synchronization mechanism for reading
timebasebin, but this is a problem in kern_tc.c that pre-existed these
"boottime" changes. At least now the problem is centralized in one location.


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
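
The locking pattern described in this entry can be sketched in userland terms as follows. All names carry a _sketch suffix because they are hypothetical stand-ins; the boolean "lock" is not a real mutex, and the time values are plain integers rather than the kernel's clock types. Only the setter asserts ownership, mirroring the NOTE above that todr_save_systime() does not yet assert the lock is held.

```c
#include <assert.h>
#include <stdbool.h>

static bool todr_locked_sketch;		/* stand-in for the TODR mutex */
static long tod_clock_sketch;		/* stand-in for the hardware TOD clock */
static long system_time_sketch;		/* stand-in for the system time */

void
todr_lock_sketch(void)
{
	assert(!todr_locked_sketch);
	todr_locked_sketch = true;
}

void
todr_unlock_sketch(void)
{
	assert(todr_locked_sketch);
	todr_locked_sketch = false;
}

bool
todr_lock_owned_sketch(void)
{
	return todr_locked_sketch;
}

/* Save system time into the TOD clock; ownership is not asserted here. */
void
todr_save_systime_sketch(void)
{
	tod_clock_sketch = system_time_sketch;
}

/* Set system time from the TOD clock, falling back to base if it is unset. */
void
todr_set_systime_sketch(long base)
{
	assert(todr_lock_owned_sketch());
	system_time_sketch = tod_clock_sketch != 0 ? tod_clock_sketch : base;
}

/* Historically named wrapper: take the lock around the real work. */
void
inittodr_sketch(long base)
{
	todr_lock_sketch();
	todr_set_systime_sketch(base);
	todr_unlock_sketch();
}
```

A suspend/resume path would hold the lock across both calls: lock, save, sleep, set, unlock.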


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries at certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
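
The arithmetic described in this entry can be sketched in userland C:
boottime is the time of day at initialisation minus the interval since the
clocks were started, and a message's wall-clock second is boottime plus the
uptime logged with it, truncated rather than rounded. The function names
below are hypothetical and this is not the kernel or dmesg code.

```c
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Return a - b, normalised so that 0 <= tv_nsec < NSEC_PER_SEC. */
struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < NSEC_PER_SEC);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += NSEC_PER_SEC;
	}
	return r;
}

/* boottime = time of day now, minus the time since the clocks started. */
struct timespec
boottime_from(struct timespec tod_now, struct timespec uptime_now)
{
	return ts_sub(tod_now, uptime_now);
}

/* Wall-clock second of a message stamped with uptime (truncating). */
time_t
msg_wallclock_sec(struct timespec boottime, struct timespec msg_uptime)
{
	time_t sec = boottime.tv_sec + msg_uptime.tv_sec;
	long nsec = boottime.tv_nsec + msg_uptime.tv_nsec;

	if (nsec >= NSEC_PER_SEC)
		sec++;		/* carry into seconds; the fraction is dropped */
	return sec;
}
```

With the time of day set 6.5 seconds after the clocks started, subtracting
that interval removes the offset that the entry describes.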


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
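
As a sketch of how such a list might be consumed, the C below walks a
NULL-terminated path list and returns the first entry that a caller-supplied
predicate accepts. The entry does not say how the kernel uses initpaths, so
the in-order, first-success behaviour is an assumption, and all function
names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list given in the commit message, NULL-terminated. */
const char *const initpaths_model[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Return the first path for which try_exec() returns 0, else NULL. */
const char *
first_usable_init(const char *const *paths, int (*try_exec)(const char *))
{
	size_t i;

	assert(paths != NULL && try_exec != NULL);
	for (i = 0; paths[i] != NULL; i++)
		if (try_exec(paths[i]) == 0)
			return paths[i];
	return NULL;
}

/* Toy predicate: pretend only the rescue binary can be executed. */
int
only_rescue_works(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Toy predicate: pretend nothing can be executed. */
int
nothing_works(const char *path)
{
	(void)path;
	return -1;
}
```

Appending "/rescue/init" last means it is only reached when every earlier
candidate has failed.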


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
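
The register/evaluate idea described above can be sketched as a small
name-keyed registry in userland C. Everything below is a hypothetical model:
the callback signature, the fixed-size table, the errno values and the toy
"is-root" evaluator (which takes a uid instead of a credential) are
assumptions for illustration, not the secmodel(9) API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical evaluator: answers one "what" query, result in *ret. */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct secmodel_model {
	const char *id;		/* e.g. "org.netbsd.secmodel.suser" */
	eval_fn     eval;
};

#define MAX_MODELS 8
struct secmodel_model models[MAX_MODELS];
size_t nmodels;

/* Register an evaluator under an identifier. */
int
model_register(const char *id, eval_fn fn)
{
	assert(id != NULL && fn != NULL);
	if (nmodels == MAX_MODELS)
		return ENOMEM;
	models[nmodels].id = id;
	models[nmodels].eval = fn;
	nmodels++;
	return 0;
}

/* Forward a query to the named model; fail cleanly if it is absent. */
int
model_eval(const char *id, const char *what, void *arg, void *ret)
{
	size_t i;

	for (i = 0; i < nmodels; i++)
		if (strcmp(models[i].id, id) == 0)
			return models[i].eval(what, arg, ret);
	return ENOENT;
}

/* Toy "suser" evaluator: arg points to a uid, root means uid 0. */
int
suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const int *)arg == 0);
	return 0;
}
```

The caller never touches another model's variables directly, so a missing
model shows up as an error return rather than an unresolved symbol.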

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !
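
What "recursive" means for a lock such as the one this entry introduces can
be sketched with an owner field and a nesting depth. This userland C model
uses hypothetical names, identifies the holder by an opaque pointer, and
returns -1 where a real lock would block; it is not the kernel's
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a recursive lock. */
struct rec_lock_model {
	const void *owner;	/* identity of the holder, NULL if free */
	unsigned    depth;	/* number of nested acquisitions */
};

/* Acquire; returns 0 on success, -1 if held by someone else. */
int
rec_lock(struct rec_lock_model *l, const void *self)
{
	assert(l != NULL && self != NULL);
	if (l->owner != NULL && l->owner != self)
		return -1;	/* a real lock would block here */
	l->owner = self;
	l->depth++;
	return 0;
}

/* Release one level; the lock becomes free when depth reaches zero. */
void
rec_unlock(struct rec_lock_model *l, const void *self)
{
	assert(l->owner == self && l->depth > 0);
	if (--l->depth == 0)
		l->owner = NULL;
}
```

The same holder may take the lock again without deadlocking, and it is only
released to others after a matching number of unlocks.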


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
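
The "always present op vector" pattern can be sketched as follows. This is a user-space illustration with hypothetical names (`struct tap_ops`, `tap_attach()`, `driver_input()`), not the bpf interface itself: callers always go through a pointer that starts out aimed at no-op stubs, so the call site needs no #if.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical op vector with a single operation. */
struct tap_ops {
	void (*tap)(const void *pkt, size_t len);
};

static size_t tapped_bytes;

/* Stub: used while the real implementation is not attached. */
static void
tap_stub(const void *pkt, size_t len)
{
	(void)pkt;
	(void)len;
}

/* "Real" implementation: here it just counts bytes. */
static void
tap_real(const void *pkt, size_t len)
{
	(void)pkt;
	tapped_bytes += len;
}

static const struct tap_ops stub_ops = { .tap = tap_stub };
static const struct tap_ops real_ops = { .tap = tap_real };

/* Always valid: points at the stubs until the real code attaches. */
static const struct tap_ops *cur_ops = &stub_ops;

static void
tap_attach(void)
{
	cur_ops = &real_ops;
}

/* Driver-side call site: unconditional, no preprocessor guard. */
static void
driver_input(const void *pkt, size_t len)
{
	assert(cur_ops != NULL);
	cur_ops->tap(pkt, len);
}
```

The sketch does not address the ordering problem the commit mentions, where drivers register before the real implementation is attached.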


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
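
The prompt handling described above amounts to classifying one line of input. A minimal sketch, assuming that a "valid path" is simply a string starting with '/', which the commit message does not specify. The enum and function names are hypothetical.

```c
#include <assert.h>
#include <string.h>

enum init_answer {
	ANS_PATH,	/* looks like a path to an init program */
	ANS_HALT,
	ANS_REBOOT,
	ANS_DDB,
	ANS_INVALID	/* caller would re-prompt */
};

/* Classify the string typed at the "init path" prompt. */
static enum init_answer
classify_init_answer(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return ANS_HALT;
	if (strcmp(s, "reboot") == 0)
		return ANS_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return ANS_DDB;
	if (s[0] == '/')	/* assumption: absolute paths only */
		return ANS_PATH;
	return ANS_INVALID;
}
```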


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
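
The shape of such a "once" primitive can be sketched with C11 atomics. This is not the NetBSD implementation: `struct once`, `run_once()` and the spin lock built from `atomic_flag` are stand-ins chosen to keep the example self-contained. The properties it shares with the description are a lock-free fast path and no retry after an error, since the stored error is simply returned again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ONCE_NEW = 0, ONCE_DONE = 1 };

struct once {
	atomic_int state;	/* ONCE_NEW or ONCE_DONE */
	int error;		/* result of the init routine */
};

/* One global lock for all once objects (a spin lock in this sketch). */
static atomic_flag once_lock = ATOMIC_FLAG_INIT;

static int
run_once(struct once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: already done, no lock taken. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_lock,
	    memory_order_acquire))
		continue;	/* spin until the lock is free */

	/* Re-check under the lock; another thread may have won the race. */
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		o->error = fn();	/* runs exactly once, even on failure */
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	atomic_flag_clear_explicit(&once_lock, memory_order_release);
	return o->error;
}

/* Example init routine that fails; counts how often it is invoked. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

One limitation of the sketch is that `fn` runs with the global lock held, so an init routine that itself calls `run_once()` would deadlock.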


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
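
The limit described above is plain arithmetic. A sketch under the commit's stated assumption of 2kB per vnode; the function names and the choice to take the smaller of the two 75% bounds are this example's reading of the message, not code from the tree.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST	UINT64_C(2048)	/* assumed worst-case bytes per vnode */

/* Largest vnode count that stays within 75% of both KVA and physical memory. */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	return limit / 4 * 3 / VNODE_COST;
}

/* Returns 0 if the requested value is acceptable, -1 otherwise. */
static int
check_desiredvnodes(uint64_t want, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return want <= max_desiredvnodes(kva_bytes, phys_bytes) ? 0 : -1;
}
```

For example, with 1 GiB of KVA and 512 MiB of physical memory the bound is 384 MiB / 2 KiB = 196608 vnodes.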


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
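
The reference-counting rule described in the list above (a lookup holds a
reference, a release drops it, and detaching an in-use discipline fails
with EBUSY) can be sketched in user-space C. This is only an illustration:
struct ldisc and the ldisc_*() names are hypothetical stand-ins, not the
kernel's ttyldisc_*() implementation, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line discipline record kept on a singly linked list. */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* outstanding lookups */
	struct ldisc *next;
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails with EEXIST if the name is taken. */
int
ldisc_attach(struct ldisc *d)
{
	assert(d != NULL && d->name != NULL);
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
int
ldisc_detach(struct ldisc *d)
{
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			if (d->refcnt != 0)
				return EBUSY;
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with one
ldisc_release(); only then can ldisc_detach() succeed.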


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().
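
The once-control pattern used in this and the neighbouring commits can be
sketched as follows. This is a simplified, single-threaded illustration:
struct once, run_once() and the wlan_*() names are hypothetical, and a real
once control such as RUN_ONCE(9) must also serialize concurrent callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical once control; not the kernel's ONCE_DECL/RUN_ONCE. */
struct once {
	bool done;	/* has the initializer run? */
	int error;	/* initializer result, replayed to later callers */
};

/* Run init() on the first call only; always return its result. */
int
run_once(struct once *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->error = init();
		o->done = true;
	}
	return o->error;
}

/* Example subsystem that initializes lazily on first attach. */
static struct once wlan_once;
static int wlan_init_calls;

static int
wlan_init(void)
{
	wlan_init_calls++;
	return 0;
}

/* Every attach goes through the once control instead of main(). */
int
wlan_ifattach(void)
{
	return run_once(&wlan_once, wlan_init);
}

int
wlan_init_count(void)
{
	return wlan_init_calls;
}
```

With this shape, the subsystem no longer needs an explicit init call at
boot: the first consumer triggers initialization and later consumers reuse
the recorded result.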


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
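
To illustrate how such an expansion might work, here is a minimal
user-space sketch. It is not the kernel's implementation: magic_expand(),
the substitution table, its placeholder values, and the rule that a magic
name must be followed by '/' or the end of the string are all assumptions
made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical substitution table; the values are placeholders. */
struct magic_var {
	const char *name;	/* without the leading '@' */
	const char *value;
};

static const struct magic_var magic_vars[] = {
	{ "machine", "i386" },
	{ "machine_arch", "i386" },
	{ "ostype", "NetBSD" },
};

/*
 * Copy src to dst (capacity dstlen), replacing each "@name" that is
 * followed by '/' or the end of the string with its value.
 * Returns 0 on success, -1 if the result does not fit.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen)
{
	const size_t nvars = sizeof(magic_vars) / sizeof(magic_vars[0]);
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		const char *piece = src;	/* default: one literal byte */
		size_t piecelen = 1, skip = 1;

		if (*src == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(magic_vars[i].name);
				char end;

				if (strncmp(src + 1, magic_vars[i].name, n) != 0)
					continue;
				end = src[1 + n];
				if (end != '/' && end != '\0')
					continue;
				piece = magic_vars[i].value;
				piecelen = strlen(piece);
				skip = n + 1;
				break;
			}
		}
		if (out + piecelen >= dstlen)
			return -1;	/* keep room for the terminator */
		memcpy(dst + out, piece, piecelen);
		out += piecelen;
		src += skip;
	}
	if (out >= dstlen)
		return -1;
	dst[out] = '\0';
	return 0;
}
```

Unknown names such as "@unknown" are copied through unchanged in this
sketch, so ordinary symlink targets containing '@' are left alone.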


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.
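
The link-set idea can be sketched in user space as below. This assumes an
ELF target with a GNU-compatible toolchain, where the linker provides
__start_/__stop_ symbols for a section whose name is a valid C identifier.
The macro, struct pool_desc, and the pool names and sizes are hypothetical
and are not the kernel's own link-set macros.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool descriptor. */
struct pool_desc {
	const char *name;
	size_t item_size;
	int initialized;
};

/* Place a pointer to the descriptor in the "pool_link_set" section. */
#define POOL_LINK_SET_ADD(desc)						\
	static struct pool_desc *pool_link_##desc			\
	    __attribute__((section("pool_link_set"), used)) = &(desc)

/* Section bounds synthesized by the linker (toolchain assumption). */
extern struct pool_desc *__start_pool_link_set[];
extern struct pool_desc *__stop_pool_link_set[];

/* Each pool registers itself next to its definition. */
static struct pool_desc session_pool = { "sessionpl", 64, 0 };
static struct pool_desc ucred_pool = { "ucredpl", 96, 0 };
POOL_LINK_SET_ADD(session_pool);
POOL_LINK_SET_ADD(ucred_pool);

/* Walk the link set once at startup; returns the number of pools. */
int
link_pool_init_all(void)
{
	int n = 0;

	for (struct pool_desc **p = __start_pool_link_set;
	    p < __stop_pool_link_set; p++) {
		assert(*p != NULL);
		(*p)->initialized = 1;	/* stand-in for the real pool setup */
		n++;
	}
	return n;
}
```

The point of the design is that adding a pool needs no edit to the startup
code: the registration lives beside the pool's definition and the linker
gathers all entries into one array.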


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
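
The sizing arithmetic can be illustrated with a small helper. This is a
sketch only: vnode_cache_entries() is a hypothetical name, the fraction is
expressed in tenths of a percent so that 0.5% is representable, and the
128-byte vnode size in the usage note is a placeholder, not a measured
value.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in a given fraction of physical memory.
 * permille is in tenths of a percent: 5 means 0.5%, 10 means 1%.
 */
uint64_t
vnode_cache_entries(uint64_t physmem_bytes, unsigned permille,
    uint64_t vnode_size)
{
	assert(vnode_size > 0 && permille <= 1000);
	/* Divide first so very large memory sizes cannot overflow. */
	return physmem_bytes / 1000 * permille / vnode_size;
}
```

Under these assumptions, a 512 MB machine with 128-byte vnodes would move
from roughly 21,000 cached vnodes at 0.5% to roughly 42,000 at 1%.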


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
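
The "iterate until no finalizer reports work" rule described above can be sketched in plain C. This is only an illustration under stated assumptions: the names `finalizer_fn`, `struct finalizer`, `run_finalizers` and `drain_one` are hypothetical and are not the kernel's actual config finalization interface.

```c
#include <stddef.h>

/* Hypothetical finalizer signature: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer repeatedly until one full pass completes in
 * which none of them reports having done work.  Returns the number of
 * passes made (including the final, idle pass).
 */
static int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: handles one pending item per call, if any remain. */
static int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Looping until a quiet pass lets one finalizer's work (for example, a newly configured RAID set) expose more work for another finalizer without the two needing to know about each other.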


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.
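
As a rough illustration of the style this change moves toward, the sketch below walks a list with `LIST_FOREACH` from `<sys/queue.h>` instead of following the link fields by hand. The `struct item` type and `sum_items` function are hypothetical and exist only for this example.

```c
#include <sys/queue.h>

/* Hypothetical list element used only for this illustration. */
struct item {
	int value;
	LIST_ENTRY(item) link;
};

LIST_HEAD(item_list, item);

/* Sum all values using the traversal macro rather than raw pointers. */
static int
sum_items(struct item_list *head)
{
	struct item *it;
	int sum = 0;

	LIST_FOREACH(it, head, link)
		sum += it->value;

	return sum;
}
```

Code written this way does not depend on the names of the underlying link members, so the list layout can change without touching every caller.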


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pagable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
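
The table above can be read as a simple translation function. The sketch below is illustrative only: the `OLD_KERN_*` enumerators and `kern_code_to_errno` are hypothetical names chosen to avoid clashing with real headers, and the codes listed as "various" or "<unused>" are deliberately left out because the log gives no single replacement for them.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired Mach-style codes. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old code to the traditional errno value given in the
 * mapping table.  Returns -1 for anything without a fixed mapping.
 */
static int
kern_code_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return -1;
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is not reversible.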


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
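
The sizing rule described above amounts to a little arithmetic. The sketch below is an assumption-laden illustration, not the kernel's code: `auto_vnode_count` and its parameters are hypothetical, and the per-vnode size is passed in by the caller because the log does not state it.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit in 0.5% (1/200) of memory, but never fewer
 * than the compiled-in minimum.  All parameter names are hypothetical.
 */
static uint64_t
auto_vnode_count(uint64_t mem_bytes, uint64_t vnode_size, uint64_t nvnode_min)
{
	uint64_t n = (mem_bytes / 200) / vnode_size;

	return n > nvnode_min ? n : nvnode_min;
}
```

With these assumed inputs, a machine with very little memory simply falls back to the minimum, while larger machines scale the cache linearly with memory.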


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
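
The rule described above (hard limit set to the system-wide maximum, soft limit set to the smaller of the compiled default and that maximum) might look roughly like this. It is a sketch only: `init_limit` and its parameter names are hypothetical, with `compiled_default` standing in for NOFILE or MAXUPRC and `system_max` for maxfiles or maxproc.

```c
#include <sys/resource.h>

/*
 * Initialize one resource limit so that getrlimit() reports a finite,
 * reasonable hard limit rather than RLIM_INFINITY.
 */
static void
init_limit(struct rlimit *rl, rlim_t compiled_default, rlim_t system_max)
{
	rl->rlim_max = system_max;
	rl->rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;
}
```

Clamping the soft limit this way keeps rlim_cur from exceeding rlim_max when the system-wide maximum is configured below the compiled default.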


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
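
The point to watch in the mapping above is that bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. The small sketch below shows the converted forms; the helper names `copy_name`, `clear_buf` and `same_bytes` are hypothetical and exist only for illustration.

```c
#include <string.h>

/* Old form: bcopy(src, dst, len).  Note the swapped first two arguments. */
static void
copy_name(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Old form: bzero(buf, len).  memset() gains an explicit fill value. */
static void
clear_buf(void *buf, size_t len)
{
	memset(buf, 0, len);
}

/* Old form: bcmp(a, b, len).  Argument order is unchanged; 0 means equal. */
static int
same_bytes(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}
```

A mechanical conversion that forgets to swap the first two arguments of bcopy() still compiles, which is why the mapping is spelled out explicitly.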


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocate vmspace0 again, and make it use
the kernel's pmap, since proc0 (and other that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.516 01-Jan-2020 thorpej

- Introduce a new global kernel variable "shutting_down" to indicate that
the system is shutting down or rebooting.
- Set this global in a new function called kern_reboot(), which is currently
just a basic wrapper around cpu_reboot().
- Call kern_reboot() instead of cpu_reboot() almost everywhere; a few
places remain where it's still called directly, but those are in early
pre-main() machdep locations.

Eventually, all of the various cpu_reboot() functions should be re-factored
and common functionality moved to kern_reboot(), but that's for another day.


# 1.515 01-Jan-2020 thorpej

First steps towards properly serializing access to the TOD clock.
- Add a mutex around the TODR, and provide lock/unlock/lock-owned
functions to manipulate it.
- Rename inittodr() to todr_set_systime() and resettodr() to
todr_save_systime() to better reflect what they do. These functions
are intended to be called with the TODR lock held, which will allow
for a pattern like:
-> todr_lock()
-> todr_save_systime()
-> [do machine-dependent stuff to sleep/suspend]
-> [magically awaken]
-> todr_set_systime(...)
-> todr_unlock()
- Provide historically-named wrappers inittodr() and resettodr() that
do the dance of acquiring / releasing the lock around the actual
substance.

NOTE: resettodr()'s use of the TODR lock is currently disabled (and
todr_save_systime() does not assert it's held) until such time as
issues around shutdown / reboot under duress can be addressed.
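
The lock/save/set/unlock pattern described above can be modelled in a few
lines of userland C. This is only a sketch under stated assumptions: every
`sketch_` name is hypothetical, the "lock" is a plain flag rather than a
kernel mutex, and the time values are bare integers. It is not the kernel
implementation, and the real todr_set_systime() takes arguments that are
elided in the log message.

```c
#include <assert.h>
#include <stdbool.h>

/* Userland stand-ins; none of this is the kernel implementation. */
static bool sketch_todr_locked;	/* models the TODR mutex */
static long sketch_todr_value;	/* models the time-of-day hardware clock */
static long sketch_systime;	/* models the system clock */

static void
sketch_todr_lock(void)
{
	assert(!sketch_todr_locked);	/* no recursion in this model */
	sketch_todr_locked = true;
}

static void
sketch_todr_unlock(void)
{
	assert(sketch_todr_locked);
	sketch_todr_locked = false;
}

static bool
sketch_todr_lock_owned(void)
{
	return sketch_todr_locked;
}

/* Save the system time into the TODR; the caller must hold the lock. */
static void
sketch_todr_save_systime(void)
{
	assert(sketch_todr_lock_owned());
	sketch_todr_value = sketch_systime;
}

/* Set the system time from the TODR; the caller must hold the lock. */
static void
sketch_todr_set_systime(void)
{
	assert(sketch_todr_lock_owned());
	sketch_systime = sketch_todr_value;
}

/* One suspend/resume cycle; returns the system time after resume. */
long
sketch_suspend_resume(long now, long asleep)
{
	sketch_systime = now;
	sketch_todr_lock();
	sketch_todr_save_systime();
	/* Machine sleeps: only the hardware clock keeps counting. */
	sketch_todr_value += asleep;
	sketch_todr_set_systime();
	sketch_todr_unlock();
	return sketch_systime;
}
```

Holding the lock across the whole cycle is what keeps another caller from
touching the clock between the save and the set.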


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
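
The class of bug described here, a size computation overflowing because a
widening cast went missing, can be sketched as follows. The function names
and the page size are hypothetical and chosen only for illustration; the
actual expression in init_main.c is not reproduced in this log.

```c
#include <assert.h>
#include <stdint.h>

/* The sketch assumes the usual 32-bit unsigned int. */
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit int");

/* Hypothetical page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096u

/* Without a cast the multiply is done in 32 bits and wraps. */
uint64_t
sketch_avail_bytes_narrow(uint32_t free_pages)
{
	return free_pages * SKETCH_PAGE_SIZE;	/* widened only after wrapping */
}

/* With the cast the multiply is done in 64 bits. */
uint64_t
sketch_avail_bytes_wide(uint32_t free_pages)
{
	return (uint64_t)free_pages * SKETCH_PAGE_SIZE;
}
```

With 2^21 free pages of 4 KiB each (8 GiB), the narrow version yields 0
while the widened version yields the full byte count.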


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
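
A circular sibling list of the kind described can be sketched in userland
C. The structure and function names below are hypothetical and far simpler
than the kernel's struct cpu_info; the sketch only shows how CPUs sharing a
core identifier can be threaded into one ring per core.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, minimal per-CPU record. */
struct sketch_cpu {
	int id;				/* CPU number */
	int core;			/* core this CPU belongs to */
	struct sketch_cpu *core_sibling;	/* next CPU in the same core */
};

/* Thread all CPUs that share a core into one circular list per core. */
void
sketch_link_core_siblings(struct sketch_cpu *cpus, size_t n)
{
	assert(cpus != NULL);
	for (size_t i = 0; i < n; i++)
		cpus[i].core_sibling = &cpus[i];	/* ring of one */
	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < i; j++) {
			if (cpus[j].core != cpus[i].core)
				continue;
			/* Splice CPU i into the ring right after CPU j. */
			cpus[i].core_sibling = cpus[j].core_sibling;
			cpus[j].core_sibling = &cpus[i];
			break;
		}
	}
}

/* Count the CPUs on the ring that contains ci, including ci itself. */
size_t
sketch_count_core_siblings(const struct sketch_cpu *ci)
{
	size_t n = 1;

	for (const struct sketch_cpu *p = ci->core_sibling; p != ci;
	    p = p->core_sibling)
		n++;
	return n;
}
```

A scheduler can then walk the ring from any CPU and visit exactly its
siblings without scanning the whole CPU list.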


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
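
The boottime adjustment described above amounts to subtracting the elapsed
uptime from the time of day once it is known. A minimal sketch of that
arithmetic on struct timespec follows; the function name is hypothetical
and this is not the code from the commit.

```c
#include <assert.h>
#include <time.h>

/*
 * boottime = now - uptime, with tv_nsec normalised into [0, 1e9).
 * Both inputs are assumed to be normalised already.
 */
struct timespec
sketch_boottime(struct timespec now, struct timespec uptime)
{
	struct timespec bt;

	assert(now.tv_nsec >= 0 && now.tv_nsec < 1000000000L);
	assert(uptime.tv_nsec >= 0 && uptime.tv_nsec < 1000000000L);

	bt.tv_sec = now.tv_sec - uptime.tv_sec;
	bt.tv_nsec = now.tv_nsec - uptime.tv_nsec;
	if (bt.tv_nsec < 0) {
		/* Borrow one second. */
		bt.tv_sec--;
		bt.tv_nsec += 1000000000L;
	}
	return bt;
}
```

For example, a time of day of 100.2 s set 6.5 s after the clocks started
gives a boottime of 93.7 s.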


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
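
What makes a UUID "v4 (random)" is small enough to sketch: all 128 bits are
random except the version and variant fields defined by RFC 4122. The
function below is a hypothetical illustration, not the code from this
commit, and it leaves the source of the random bytes to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Turn 16 caller-supplied random bytes into a version 4 UUID in place. */
void
sketch_uuid_v4_from_random(uint8_t u[16])
{
	assert(u != NULL);
	u[6] = (uint8_t)((u[6] & 0x0f) | 0x40);	/* version field = 4 */
	u[8] = (uint8_t)((u[8] & 0x3f) | 0x80);	/* variant bits = 10 */
}
```

Because no MAC address or timestamp goes into the value, nothing about the
host or the time of generation can be read back out of it.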


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
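
The register/evaluate flow described in this entry can be modelled with a
small lookup table of callbacks. Everything below is a userland sketch
under stated assumptions: the `sketch_` names, the callback signature, the
fixed-size table and the -1 error value are all hypothetical and are not
the secmodel(9) API. Only the identifier string and the "is-root" query
come from the log message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical evaluator callback: answer query `what` about `arg`. */
typedef int (*sketch_eval_cb)(const char *what, void *arg, void *ret);

struct sketch_secmodel {
	const char *id;
	sketch_eval_cb eval;
};

#define SKETCH_MAX 8
static struct sketch_secmodel sketch_models[SKETCH_MAX];
static size_t sketch_nmodels;

/* Register a model under an identifier; -1 if the table is full. */
int
sketch_register(const char *id, sketch_eval_cb cb)
{
	assert(id != NULL && cb != NULL);
	if (sketch_nmodels == SKETCH_MAX)
		return -1;
	sketch_models[sketch_nmodels].id = id;
	sketch_models[sketch_nmodels].eval = cb;
	sketch_nmodels++;
	return 0;
}

/* Forward a query to the named model; -1 if that model is absent. */
int
sketch_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < sketch_nmodels; i++) {
		if (strcmp(sketch_models[i].id, id) == 0)
			return sketch_models[i].eval(what, arg, ret);
	}
	return -1;
}

/* Toy "suser" evaluator: arg points to a uid, ret to a bool. */
int
sketch_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return -1;	/* unknown query */
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

The caller never touches the other model's state directly, so an absent
model shows up as an error return rather than an unsafe access.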


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
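
As a rough illustration of the kind of check involved, the following is a
minimal sketch of the monobit test, one of the four FIPS 140-2 statistical
tests. It is not the code in libkern/rngtest.c. The function name, the
constant name and the pass/fail return convention are made up for this
example, and the bounds (9725 < ones < 10275 over a 20,000-bit sample) are
quoted from the FIPS 140-2 text as I understand it.

```c
#include <stddef.h>
#include <stdint.h>

/* FIPS 140-2 tests operate on a single 20,000-bit (2500-byte) sample. */
#define MONOBIT_SAMPLE_BYTES 2500

/*
 * Hypothetical helper: count the one bits in the sample and check that
 * the count lies strictly between the assumed FIPS 140-2 bounds.
 * Returns 1 on pass, 0 on fail.
 */
int
monobit_sketch(const uint8_t sample[MONOBIT_SAMPLE_BYTES])
{
	unsigned int ones = 0;

	for (size_t i = 0; i < MONOBIT_SAMPLE_BYTES; i++) {
		uint8_t b = sample[i];

		/* Count the set bits of this byte one at a time. */
		while (b != 0) {
			ones += b & 1u;
			b >>= 1;
		}
	}
	return ones > 9725 && ones < 10275;
}
```

A stuck source that returns all zeros or all ones fails this check at once,
which is the "bad or stuck hardware RNG" case described above. The other
three tests (poker, runs, long run) are not shown.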

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
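
The behaviour described above (lockless fast path, one global lock, no
retry after an error) can be sketched in userland C11 roughly as follows.
This is not the kernel's once implementation: every name here is
hypothetical, and a spinning atomic_flag stands in for the kernel's global
lock only to keep the example self-contained.

```c
#include <stdatomic.h>

/* Hypothetical once-control block; zero-initialised means "not yet run". */
typedef struct {
	atomic_int done;	/* becomes 1 after the routine has run */
	int error;		/* result of that single run */
} once_sketch_t;

/* One global lock shared by all once-control blocks (stand-in). */
static atomic_flag once_sketch_lock = ATOMIC_FLAG_INIT;

int
run_once_sketch(once_sketch_t *o, int (*init)(void))
{
	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	/* Slow path: serialise first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_sketch_lock,
	    memory_order_acquire))
		continue;
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Record the result even on failure; there is no retry. */
		o->error = init();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	atomic_flag_clear_explicit(&once_sketch_lock, memory_order_release);
	return o->error;
}

/* Example routine that always fails, to show the no-retry behaviour. */
static int sketch_calls;
static once_sketch_t demo_once;

static int
failing_init_sketch(void)
{
	sketch_calls++;
	return 5;
}
```

Calling run_once_sketch(&demo_once, failing_init_sketch) twice returns 5
both times while failing_init_sketch runs only once. Spinning while the
routine runs is only tolerable in a toy; a real implementation would sleep
on the lock instead.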


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
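
The arithmetic behind that cap can be sketched as below. This is an
illustration only, not the code from the commit: the function and constant
names are invented, and reading "75% of KVA or 75% of physical memory" as
"75% of whichever is smaller" is my interpretation.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode plus associated structures. */
#define VNODE_COST_SKETCH 2048u

/*
 * Hypothetical helper: the largest desiredvnodes value that keeps
 * worst-case vnode memory within 75% of both KVA and physical memory.
 */
uint64_t
max_desiredvnodes_sketch(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Divide before multiplying so very large sizes cannot overflow. */
	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the smaller
figure is 1 GiB, 75% of that is 768 MiB, and at 2 kB per vnode the cap
works out to 393216 vnodes.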


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
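
The reference-counting rule above (a lookup holds a reference, and detaching
an in-use discipline fails with EBUSY) can be sketched in userland C. This is
only an illustration, not the tty code: the names disc_attach(), disc_lookup(),
disc_release() and disc_detach() are hypothetical, a small fixed array stands
in for the linked list, and all locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_DISC 8	/* arbitrary capacity for this sketch */

struct disc {
	const char *name;
	unsigned int refcnt;	/* references handed out by disc_lookup() */
};

static struct disc *registry[MAX_DISC];

/* Register a discipline; fails if the name is taken or the table is full. */
static int
disc_attach(struct disc *d)
{
	int slot = -1;

	assert(d != NULL && d->name != NULL);
	for (int i = 0; i < MAX_DISC; i++) {
		if (registry[i] == NULL) {
			if (slot < 0)
				slot = i;
		} else if (strcmp(registry[i]->name, d->name) == 0) {
			return EEXIST;
		}
	}
	if (slot < 0)
		return ENOMEM;
	d->refcnt = 0;
	registry[slot] = d;
	return 0;
}

/* Look up by name; a successful lookup takes a reference. */
static struct disc *
disc_lookup(const char *name)
{
	assert(name != NULL);
	for (int i = 0; i < MAX_DISC; i++) {
		if (registry[i] != NULL &&
		    strcmp(registry[i]->name, name) == 0) {
			registry[i]->refcnt++;
			return registry[i];
		}
	}
	return NULL;
}

/* Drop a reference obtained from disc_lookup(). */
static void
disc_release(struct disc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister a discipline; refuses while references are outstanding. */
static int
disc_detach(struct disc *d)
{
	assert(d != NULL);
	for (int i = 0; i < MAX_DISC; i++) {
		if (registry[i] == d) {
			if (d->refcnt != 0)
				return EBUSY;
			registry[i] = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

In this sketch a caller pairs every successful disc_lookup() with one
disc_release(); only then can disc_detach() succeed.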


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
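
A "once control" as used in this and the neighbouring commits can be
sketched as follows. This is a single-threaded userland illustration under
stated assumptions: struct once_ctl, run_once() and the subsys_* names are
hypothetical, and a real kernel implementation would also need to serialize
concurrent callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical once control: remembers whether init ran and its result. */
struct once_ctl {
	bool done;
	int error;
};

#define ONCE_INITIALIZER { false, 0 }

/*
 * Run fn() the first time this is called for a given control and cache
 * its return value; later calls only return the cached value.
 * Not thread-safe: this sketch omits all locking.
 */
static int
run_once(struct once_ctl *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);
	if (!o->done) {
		o->error = fn();
		o->done = true;
	}
	return o->error;
}

/* Example subsystem that initializes lazily on first open. */
static int subsys_init_calls;

static int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

static struct once_ctl subsys_once = ONCE_INITIALIZER;

static int
subsys_open(void)
{
	/* No explicit init call at boot; the first open triggers it. */
	return run_once(&subsys_once, subsys_init);
}
```

The design point is that the subsystem no longer needs an explicit init call
in main(); whoever opens it first pays the initialization cost.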


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
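
The substitution itself can be sketched in userland C. This is an
illustration only, not the kernel implementation: expand_magic() and
magic_table are hypothetical names, the values are hard-coded constants, and
the rule that a magic name must be followed by '/' or the end of the string
is an assumption made here so that "@machine" does not match inside
"@machine_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table; a kernel would take the values from system state. */
struct magic_entry {
	const char *name;	/* name without the leading '@' */
	const char *value;
};

static const struct magic_entry magic_table[] = {
	{ "machine_arch", "i386" },
	{ "machine", "i386" },
	{ "ostype", "NetBSD" },
};

#define NMAGIC (sizeof(magic_table) / sizeof(magic_table[0]))

/*
 * Copy src to dst (capacity dstlen), replacing recognized @names.
 * Returns 0 on success, -1 if the expanded string does not fit.
 */
static int
expand_magic(const char *src, char *dst, size_t dstlen)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const char *repl = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < NMAGIC; i++) {
				size_t n = strlen(magic_table[i].name);

				/* Assumed rule: name ends at '/' or NUL. */
				if (strncmp(src + 1, magic_table[i].name,
				    n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					repl = magic_table[i].value;
					skip = 1 + n;
					break;
				}
			}
		}

		if (repl != NULL) {
			size_t rlen = strlen(repl);

			if (out + rlen >= dstlen)
				return -1;
			memcpy(dst + out, repl, rlen);
			out += rlen;
			src += skip;
		} else {
			/* Ordinary character, or an unrecognized '@'. */
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With the table above, the link target from the example usage would expand to
the i386 directory, while an unrecognized name is copied through unchanged.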


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
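
The marker-skipping iteration can be sketched like this. It is a simplified
userland illustration: struct entry, ENTRY_FOREACH and entry_foreach_call()
are hypothetical names, the list is singly linked, stopping on a non-zero
callback result is an assumption, and the part where the iterator inserts its
own marker to keep its place while the callback runs is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* List element; markers are placeholders that carry no real data. */
struct entry {
	struct entry *next;
	bool is_marker;
	int pid;
};

/* Advance past any marker entries, returning the next real entry. */
static struct entry *
next_real(struct entry *e)
{
	while (e != NULL && e->is_marker)
		e = e->next;
	return e;
}

/* Like a plain list walk, but marker entries are never visited. */
#define ENTRY_FOREACH(var, head)				\
	for ((var) = next_real(head); (var) != NULL;		\
	    (var) = next_real((var)->next))

/*
 * Call fn(entry, arg) for each real entry. A non-zero return from fn
 * stops the walk and is passed back to the caller (an assumption of
 * this sketch).
 */
static int
entry_foreach_call(struct entry *head,
    int (*fn)(struct entry *, void *), void *arg)
{
	struct entry *e;
	int ret = 0;

	assert(fn != NULL);
	ENTRY_FOREACH(e, head) {
		ret = fn(e, arg);
		if (ret != 0)
			break;
	}
	return ret;
}

/* Example callback: accumulate the pids of all visited entries. */
static int
sum_pid(struct entry *e, void *arg)
{
	*(int *)arg += e->pid;
	return 0;
}
```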


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
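
The sizing rule described above can be sketched as follows. This is an illustration under stated assumptions, not the code in init_main.c: the function name `vnode_cache_size` and its parameters are hypothetical, the fraction is given in per-mille so that both 0.5% (5) and 1% (10) can be expressed, and `vnode_bytes` is assumed to be nonzero.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch only: choose a vnode count so that the cache occupies
 * roughly permille/1000 of physical memory, never going below
 * the given floor. vnode_bytes must be nonzero.
 */
static uint64_t
vnode_cache_size(uint64_t physmem_bytes, unsigned permille,
    size_t vnode_bytes, uint64_t floor)
{
	uint64_t budget = physmem_bytes / 1000 * permille;
	uint64_t n = budget / vnode_bytes;

	return n > floor ? n : floor;
}
```

Because the count is the memory budget divided by the vnode size, a port with a larger struct vnode gets fewer vnodes for the same fraction of memory, which is the concern raised for the 64-bit ports.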


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
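
The "run until nobody did any work" loop described above can be sketched like this. The type and function names (`finalizer_fn`, `struct finalizer`, `run_finalizers`, `countdown_finalizer`) are hypothetical and chosen for illustration; the kernel's actual hook interface is not shown in this log.

```c
#include <stddef.h>

/* Hypothetical finalizer type: returns nonzero if it did some work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Run every finalizer repeatedly until a full pass completes in
 * which none of them reports having done any work. Returns the
 * number of passes made, including the final idle pass.
 */
static int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: has work to do until its counter reaches zero. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for example, configuring a RAID set) may expose new devices that give another finalizer something to do.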


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
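
The two properties listed above, client-supplied storage and constant-time insertion and removal, can be illustrated with an intrusive list. This is a sketch under assumptions: `struct callout_sketch`, `sketch_insert` and `sketch_remove` are hypothetical names, and the real callout code (which also has to order entries by expiry time) is not reproduced here.

```c
#include <stddef.h>

/*
 * Hypothetical callout handle. The caller embeds or declares this
 * structure itself, so scheduling never has to allocate memory.
 */
struct callout_sketch {
	struct callout_sketch *next;
	struct callout_sketch **prevp;	/* address of the pointer to us */
	void (*fn)(void *);
	void *arg;
};

/* Constant-time insertion at the head of a bucket. */
static void
sketch_insert(struct callout_sketch **head, struct callout_sketch *c)
{
	c->next = *head;
	c->prevp = head;
	if (*head != NULL)
		(*head)->prevp = &c->next;
	*head = c;
}

/* Constant-time removal; no list walk is needed to find the entry. */
static void
sketch_remove(struct callout_sketch *c)
{
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}
```

Keeping a back-pointer to the previous link, rather than to the previous node, lets removal work the same way for the first entry as for any other.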


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
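
How machine-dependent code might turn a (low address, size) pair into an initial stack pointer can be sketched as below. The function `initial_sp` and its `grows_up` flag are hypothetical; real ports also apply alignment and frame setup, which this sketch leaves out.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch only: derive the starting stack pointer from the low
 * address and size of the stack area. On a downward-growing
 * stack the pointer starts at the high end; on an upward-growing
 * stack it starts at the low end.
 */
static uintptr_t
initial_sp(uintptr_t stack_low, size_t stack_size, int grows_up)
{
	return grows_up ? stack_low : stack_low + stack_size;
}
```

Describing the area by its low address and size keeps the machine-independent interface identical on every port, whatever the stack direction.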


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
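
The rule described above can be sketched in userland C as follows. `rlim_min` and `initial_limit` are hypothetical helper names; the compiled-in default stands for NOFILE or MAXUPRC and the system maximum for maxfiles or maxproc.

```c
#include <sys/resource.h>

/* Smaller of two limit values. */
static rlim_t
rlim_min(rlim_t a, rlim_t b)
{
	return a < b ? a : b;
}

/*
 * Sketch only: the soft limit is the compiled-in default clamped
 * to the system-wide maximum, and the hard limit is the system-wide
 * maximum itself rather than RLIM_INFINITY.
 */
static struct rlimit
initial_limit(rlim_t compiled_default, rlim_t system_max)
{
	struct rlimit rl;

	rl.rlim_cur = rlim_min(compiled_default, system_max);
	rl.rlim_max = system_max;
	return rl;
}
```

With a finite hard limit, a program that sizes a table from getrlimit() gets a value it can actually allocate.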


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
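
The argument-order swap in this mapping is easy to get wrong, so here is a small userland sketch. The wrapper names `copy_old_style`, `overlap_copy_old_style` and `zero_old_style` are hypothetical and exist only to show the translation side by side.

```c
#include <string.h>

/*
 * bcopy() and ovbcopy() take (src, dst, len);
 * memcpy() and memmove() take (dst, src, len).
 */
static void
copy_old_style(const void *src, void *dst, size_t len)
{
	/* was: bcopy(src, dst, len); buffers must not overlap */
	memcpy(dst, src, len);
}

static void
overlap_copy_old_style(const void *src, void *dst, size_t len)
{
	/* was: ovbcopy(src, dst, len); overlapping buffers are allowed */
	memmove(dst, src, len);
}

static void
zero_old_style(void *buf, size_t len)
{
	/* was: bzero(buf, len) */
	memset(buf, 0, len);
}
```

Only the copy routines change argument order; bcmp() to memcmp() keeps the same order, and bzero() gains an explicit fill value of 0.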


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocate vmspace0 again, and make it use
the kernel's pmap, since proc0 (and other that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.
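
The two alignment entries here (1.71 and 1.70) can be illustrated with a small sketch. It assumes ALIGNBYTES is a low-bit mask of the form "alignment minus one", and reads "ALIGN() backwards" as rounding an address down rather than up. The names SKETCH_ALIGNBYTES, align_up and align_down are hypothetical and do not come from the kernel sources.

```c
#include <stdint.h>

/* Hypothetical stand-in for the machine-dependent ALIGNBYTES mask. */
#define SKETCH_ALIGNBYTES ((uintptr_t)(sizeof(long) - 1))

/* Round an address up to the next aligned boundary (the ALIGN() direction). */
static uintptr_t
align_up(uintptr_t addr)
{
	return (addr + SKETCH_ALIGNBYTES) & ~SKETCH_ALIGNBYTES;
}

/* Round an address down, e.g. when carving space below the top of a stack. */
static uintptr_t
align_down(uintptr_t addr)
{
	return addr & ~SKETCH_ALIGNBYTES;
}
```

Rounding up an address that sits just below a stack top would step past the top, which is why the downward case needs the plain mask instead.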


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.514 31-Dec-2019 ad

Rename uvm_free() -> uvm_availmem().


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com
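
The log does not show the affected expression, so the following is only a sketch of the general failure mode: a page count multiplied by a page size in 32-bit arithmetic wraps before the result is widened. SKETCH_PAGE_SIZE and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical page size for the sketch. */
#define SKETCH_PAGE_SIZE 4096u

/* Without the cast: the multiply is done in 32 bits and wraps. */
static uint64_t
bytes_without_cast(uint32_t npages)
{
	return npages * SKETCH_PAGE_SIZE;
}

/* With the cast: widen first, then multiply in 64 bits. */
static uint64_t
bytes_with_cast(uint32_t npages)
{
	return (uint64_t)npages * SKETCH_PAGE_SIZE;
}
```

With 2097152 pages (8 GiB at 4 KiB per page) the first form wraps to zero on platforms where unsigned int is 32 bits wide, while the second yields the full byte count.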


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
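
A minimal sketch of the "circular list of siblings" idea, assuming each CPU carries a core identifier. The structure and function names are hypothetical and do not reflect the kernel's struct cpu_info.

```c
#include <stddef.h>

/* Hypothetical per-CPU record; only the fields the sketch needs. */
struct sketch_cpu {
	int core_id;
	struct sketch_cpu *core_sibling;	/* next CPU in the same core */
};

/* Link every CPU into a ring with the other CPUs sharing its core_id. */
static void
sketch_link_core_siblings(struct sketch_cpu *cpus, size_t n)
{
	size_t i, j;

	for (i = 0; i < n; i++)
		cpus[i].core_sibling = &cpus[i];	/* ring of one */

	for (i = 0; i < n; i++) {
		for (j = 0; j < i; j++) {
			if (cpus[j].core_id == cpus[i].core_id) {
				/* Splice cpus[i] into the ring after cpus[j]. */
				cpus[i].core_sibling = cpus[j].core_sibling;
				cpus[j].core_sibling = &cpus[i];
				break;
			}
		}
	}
}

/* Count the members of the ring that contains c. */
static size_t
sketch_ring_len(const struct sketch_cpu *c)
{
	const struct sketch_cpu *p = c->core_sibling;
	size_t len = 1;

	while (p != c) {
		len++;
		p = p->core_sibling;
	}
	return len;
}
```

A CPU with no siblings points at itself, so callers can walk the ring without a special case for single-threaded cores.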


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
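
The boottime adjustment described in this entry amounts to a timespec subtraction. The sketch below assumes the inputs are the wall-clock time at the moment time of day was initialised and the uptime accumulated since the clocks were started; ts_sub and sketch_boottime are hypothetical names, not kernel functions.

```c
#include <time.h>

/*
 * Return a - b, keeping tv_nsec in the range [0, 1e9).
 * Assumes a >= b and that both inputs are already normalised.
 */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time is the time of day at initialisation minus the interval
 * that had already elapsed since the clocks were started.
 */
static struct timespec
sketch_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

For example, a time of day of 1000.1 s with 6.5 s of uptime already elapsed gives a boot time of 993.6 s.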


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
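
The class of bug fixed in 1.479 is a copy sized by the wrong object. The sketch below is a userland illustration only: memcpy() stands in for copyout(), and sketch_copy_argv is a hypothetical helper, not code from start_init().

```c
#include <stddef.h>
#include <string.h>

/* init's argument vector has three slots: two strings and a NULL. */
#define SKETCH_ARGC 3

/*
 * Build a three-entry argv[] and copy it to dst.
 * Returns the number of bytes copied, or 0 if dst is too small.
 */
static size_t
sketch_copy_argv(void *dst, size_t dstlen, char *arg0, char *arg1)
{
	char *argv[SKETCH_ARGC];

	argv[0] = arg0;
	argv[1] = arg1;
	argv[2] = NULL;		/* terminator */

	/* Size the copy by the vector itself, not by an unrelated object. */
	if (dstlen < sizeof(argv))
		return 0;
	memcpy(dst, argv, sizeof(argv));
	return sizeof(argv);
}
```

The copy length is always 3 * sizeof(char *), independent of how large any syscall argument structure happens to be.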


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
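
A version 4 UUID is 16 random bytes with the version and variant fields overwritten. The sketch below uses the RFC 4122 byte-string layout; the kernel's struct uuid layout and its source of random bytes are not shown in this log, so the caller supplies the bytes and sketch_uuid_v4_stamp is a hypothetical name.

```c
#include <stdint.h>

/*
 * Stamp the version (4) and variant (binary 10xx) fields onto
 * 16 random bytes supplied by the caller.
 */
static void
sketch_uuid_v4_stamp(uint8_t uuid[16])
{
	uuid[6] = (uint8_t)((uuid[6] & 0x0f) | 0x40);	/* version 4 */
	uuid[8] = (uint8_t)((uuid[8] & 0x3f) | 0x80);	/* variant 10xx */
}
```

No MAC address or timestamp is involved, which is the leakage the entry above removes.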


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
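
Since the entry notes that the use of initpaths is undocumented, the following is only a guess at the obvious reading: the paths are tried in order and the first one that can be executed wins. The try_exec callback and every sketch_ name are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* The list as quoted in the log entry above. */
static const char *const sketch_initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Try each path in order with the caller's callback, which returns 0
 * on success. Returns the first path that works, or NULL if none do.
 */
static const char *
sketch_pick_init(int (*try_exec)(const char *))
{
	size_t i;

	for (i = 0; sketch_initpaths[i] != NULL; i++) {
		if (try_exec(sketch_initpaths[i]) == 0)
			return sketch_initpaths[i];
	}
	return NULL;
}

/* Demo callback: pretend only /rescue/init can be executed. */
static int
sketch_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

Under that reading, placing "/rescue/init" last makes it a fallback for a system whose /sbin copies are missing or broken.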


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.
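
A minimal user-space sketch of the register/deregister/evaluate pattern
described above. This is not the secmodel(9) implementation: every name
ending in _sketch, the fixed-size table, the uid-based "is-root" evaluator
and the ENOENT return for an absent secmodel are assumptions made for
illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical evaluator callback: interprets "what", reads arg, writes ret. */
typedef int (*eval_sketch_fn)(const char *what, void *arg, void *ret);

struct model_sketch {
	const char *id;		/* NULL marks a free slot */
	eval_sketch_fn eval;
};

#define MAX_MODELS_SKETCH 8
static struct model_sketch models_sketch[MAX_MODELS_SKETCH];

static struct model_sketch *
find_sketch(const char *id)
{
	for (size_t i = 0; i < MAX_MODELS_SKETCH; i++) {
		if (models_sketch[i].id != NULL &&
		    strcmp(models_sketch[i].id, id) == 0)
			return &models_sketch[i];
	}
	return NULL;
}

/* Register an evaluator under a unique identifier. */
int
register_sketch(const char *id, eval_sketch_fn eval)
{
	assert(id != NULL && eval != NULL);
	if (find_sketch(id) != NULL)
		return EEXIST;
	for (size_t i = 0; i < MAX_MODELS_SKETCH; i++) {
		if (models_sketch[i].id == NULL) {
			models_sketch[i].id = id;
			models_sketch[i].eval = eval;
			return 0;
		}
	}
	return ENOSPC;
}

/* Remove a previously registered evaluator. */
int
deregister_sketch(const char *id)
{
	struct model_sketch *m = find_sketch(id);

	if (m == NULL)
		return ENOENT;
	m->id = NULL;
	m->eval = NULL;
	return 0;
}

/* Forward a request; an absent model yields an error, not a crash. */
int
eval_sketch(const char *id, const char *what, void *arg, void *ret)
{
	struct model_sketch *m = find_sketch(id);

	if (m == NULL)
		return ENOENT;
	return m->eval(what, arg, ret);
}

/* Toy evaluator: arg points to an unsigned uid, ret to a bool. */
int
suser_eval_sketch(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

The caller never touches the other model's state directly, so a missing
model shows up as an error return that the caller can handle.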


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
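
For illustration, the idea behind a continuous-output test for detecting a
stuck hardware RNG can be sketched as follows: each output block is compared
with the one before it, and a repeat counts as a failure. This is a
simplified sketch, not the code from this commit; the names, the 8-byte
block size and the state layout are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_BLOCK 8	/* assumed block size in bytes */

struct crng_sketch {
	unsigned char prev[SKETCH_BLOCK];	/* last block seen */
	bool have_prev;				/* false until first block */
};

/*
 * Return true if the block passes, false if it is identical to the
 * previous block. The first block is only recorded, never rejected.
 */
bool
crng_sketch_check(struct crng_sketch *st, const unsigned char *block)
{
	bool ok;

	assert(st != NULL && block != NULL);
	ok = !st->have_prev || memcmp(st->prev, block, SKETCH_BLOCK) != 0;
	memcpy(st->prev, block, SKETCH_BLOCK);
	st->have_prev = true;
	return ok;
}
```

A driver using such a check would detach the source on failure instead of
feeding its samples into the pool.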


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
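
The shape described above (lockless fast path, one global lock on the slow
path, no retry after an error) can be sketched in portable C11 as follows.
This is an illustration only, not the kernel code: the names, the spin lock
standing in for the global lock, and the demo initializers are assumptions,
and the sketch holds the lock while the init routine runs, so a nested once
would deadlock here.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ONCE_SKETCH_NEW = 0, ONCE_SKETCH_DONE = 1 };

typedef struct {
	atomic_int state;	/* ONCE_SKETCH_NEW or ONCE_SKETCH_DONE */
	int error;		/* result of the init routine, kept forever */
} once_sketch_t;

/* One global lock shared by all once objects (assumed spin lock). */
static atomic_flag once_sketch_lock = ATOMIC_FLAG_INIT;

int
once_sketch_run(once_sketch_t *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: already done, no lock taken. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) ==
	    ONCE_SKETCH_DONE)
		return o->error;

	/* Slow path: serialize on the global lock and re-check. */
	while (atomic_flag_test_and_set_explicit(&once_sketch_lock,
	    memory_order_acquire))
		continue;
	if (atomic_load_explicit(&o->state, memory_order_relaxed) !=
	    ONCE_SKETCH_DONE) {
		/* Record the result even on failure: there is no retry. */
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_SKETCH_DONE,
		    memory_order_release);
	}
	atomic_flag_clear_explicit(&once_sketch_lock, memory_order_release);
	return o->error;
}

/* Demo initializers that count how often they actually run. */
int once_sketch_demo_calls;

int
once_sketch_demo_init(void)
{
	once_sketch_demo_calls++;
	return 0;
}

int
once_sketch_demo_fail(void)
{
	once_sketch_demo_calls++;
	return EIO;
}
```

Later callers of a failed once object get the recorded error back without
the routine running again.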


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
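
As a worked example of the arithmetic, assuming the limit is simply 75% of
the smaller of KVA and physical memory divided by the 2kB worst-case cost
(the function and macro names are hypothetical, and the real check may
differ in detail):

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST_SKETCH 2048u	/* assumed worst-case bytes per vnode */

/* Largest vnode count that stays within 75% of both KVA and RAM. */
uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t limit;

	assert(kva_bytes > 0 && physmem_bytes > 0);
	limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;
	/* Divide before multiplying to avoid overflow on huge inputs. */
	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}
```

With 1 GiB of KVA and 4 GiB of RAM, KVA is the binding constraint and the
sketch yields 393216 vnodes.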


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structure initializers when possible.
[from matt-armv6]
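
For readers unfamiliar with the C99 form, here is a small illustration. The struct and its members are made up for the example and do not come from init_main.c:

```c
#include <assert.h>

/* Hypothetical structure, used only to contrast the two styles. */
struct limits {
	int	soft;
	int	hard;
	int	flags;
};

/* Positional form: the meaning depends on member order. */
static const struct limits old_style = { 64, 128, 0 };

/* C99 designated form: members are named, unnamed ones are zeroed. */
static const struct limits new_style = {
	.hard = 128,
	.soft = 64,
};
```

Both objects hold the same values, but the designated form keeps working if members are later reordered or added.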


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.
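
From userland the two names are queried through sysconf(3). A minimal sketch (the wrapper function names are only for illustration):

```c
#include <assert.h>
#include <unistd.h>

/* Number of processors currently online, or -1 on error. */
static long
cpus_online(void)
{
	return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Number of processors configured, or -1 on error. */
static long
cpus_configured(void)
{
	return sysconf(_SC_NPROCESSORS_CONF);
}
```

A caller sizing a worker pool would typically use the online count and treat a result below 1 as "unknown".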


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer the callout_startup2() call (which calls softintr_establish(9))
until after cpu_configure(9) for now, because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
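
The weak-alias technique can be sketched outside the kernel as follows. This assumes GCC or Clang targeting ELF; the function names are hypothetical stand-ins, and the return value is added only so the default can be observed (the kernel would more likely use its own weak-alias macro than the raw attribute):

```c
#include <assert.h>

/* Default routine: does nothing and reports success. */
int
model_start_nop(void)
{
	return 0;
}

/*
 * Weak alias: model_start() resolves to model_start_nop() unless some
 * other object file supplies a strong definition of model_start().
 */
int model_start(void) __attribute__((weak, alias("model_start_nop")));
```

With no strong definition linked in, a call to model_start() lands in the nop routine, which is what lets the kernel build without a security model.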


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
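
The reference-counting rule described above (a lookup takes a reference, release drops it, detach of an in-use entry fails with EBUSY) can be modelled in a few lines. This is a single-threaded sketch with hypothetical names and no locking, not the ttyldisc code itself:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, single-threaded model of a refcounted registry. */
struct disc {
	const char	*name;
	unsigned	 refcnt;
	struct disc	*next;
};

static struct disc *disc_list;	/* linked list instead of an array */

/* Register a discipline; duplicate names are refused. */
static int
disc_attach(struct disc *d)
{
	struct disc *p;

	for (p = disc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = disc_list;
	disc_list = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
static struct disc *
disc_lookup(const char *name)
{
	struct disc *p;

	for (p = disc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from disc_lookup(). */
static void
disc_release(struct disc *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
static int
disc_detach(struct disc *d)
{
	struct disc **pp;

	if (d->refcnt != 0)
		return EBUSY;
	for (pp = &disc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A real implementation would protect the list and the counts with a lock; that is left out here to keep the lifetime rule visible.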


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
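
The "once control" pattern moves initialization from a fixed call in main() to the first use of the subsystem. In the kernel this is done with RUN_ONCE(9); the sketch below uses pthread_once() as a userland stand-in, and the subsystem names are hypothetical:

```c
#include <assert.h>
#include <pthread.h>

static pthread_once_t subsys_once = PTHREAD_ONCE_INIT;
static int subsys_init_count;	/* counts how often init really ran */

/* One-time setup that used to be called from a central startup path. */
static void
subsys_init(void)
{
	subsys_init_count++;
}

/* Entry point: safe to call many times; init runs only on the first. */
static int
subsys_open(void)
{
	int error;

	error = pthread_once(&subsys_once, subsys_init);
	if (error != 0)
		return error;
	/* ... normal open processing would follow here ... */
	return 0;
}
```

The benefit is that a subsystem that is never opened never pays for its initialization, and the startup code no longer needs to know about it.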


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
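
A minimal sketch of the expansion idea in plain userland C, not the kernel
code from this commit. The table contents (hp300/m68k), the function name
expand_magic() and the plain prefix matching are all assumptions made for
illustration; the real implementation may apply stricter rules about where
a magic string may start and end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct magic {
	const char *name;	/* magic string as written in the symlink */
	const char *value;	/* system-specific replacement */
};

/*
 * Hypothetical values.  Longer names come first so that
 * "@machine_arch" is not mistaken for "@machine" plus "_arch".
 */
static const struct magic magic_table[] = {
	{ "@machine_arch", "m68k" },
	{ "@machine", "hp300" },
	{ "@ostype", "NetBSD" },
};

/* Expand magic strings from `in' into `out'; returns 0, or -1 if out is too small. */
static int
expand_magic(const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	assert(out != NULL && outlen > 0);
	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 0, replen, i;

		if (*in == '@') {
			for (i = 0; i < sizeof(magic_table) / sizeof(magic_table[0]); i++) {
				size_t n = strlen(magic_table[i].name);

				if (strncmp(in, magic_table[i].name, n) == 0) {
					rep = magic_table[i].value;
					skip = n;
					break;
				}
			}
		}
		if (rep == NULL) {
			/* Ordinary character: copy one byte, keep room for the NUL. */
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;
			continue;
		}
		replen = strlen(rep);
		if (o + replen >= outlen)
			return -1;
		memcpy(out + o, rep, replen);
		o += replen;
		in += skip;
	}
	out[o] = '\0';
	return 0;
}
```

With the table above, "/arch/@machine_arch/bin" would expand to
"/arch/m68k/bin", which is the behaviour the ln -s example relies on.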


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
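
The "iterate until nobody reports work" rule in this entry can be sketched
as a fixed-point loop. This is an illustration only: the type finalizer_fn,
struct finalizer, run_finalizers() and configure_one() are hypothetical
names, not the interface added by the commit.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/* Run every finalizer repeatedly until a full pass does no work. */
static int
run_finalizers(const struct finalizer *list, size_t n)
{
	int passes = 0, did_work;
	size_t i;

	assert(list != NULL || n == 0);
	do {
		did_work = 0;
		for (i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;	/* number of passes, including the final idle one */
}

/* Example finalizer: configures one pending unit per call. */
static int
configure_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Re-running all finalizers after any of them makes progress lets one
finalizer pick up devices that only appeared because another one just
configured something, which is presumably why the hook iterates rather
than making a single pass.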


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
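
The sizing rule in this entry amounts to a small piece of arithmetic. A
rough sketch, assuming memory is given in bytes; the function name
autosize_vnodes() and the per-vnode size used in the note below are
hypothetical, and the real code works from the kernel's own physmem and
struct vnode.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in 0.5% of physical memory,
 * but never fewer than the configured NVNODE minimum.
 */
static uint64_t
autosize_vnodes(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t nvnode_min)
{
	uint64_t budget, n;

	assert(vnode_size > 0);
	budget = physmem_bytes / 200;	/* 0.5% of physical memory */
	n = budget / vnode_size;
	return n > nvnode_min ? n : nvnode_min;
}
```

For instance, with 1 GB of memory and an assumed 256-byte vnode the budget
is about 5 MB, giving roughly 21 thousand vnodes; on a 16 MB machine the
computed value falls below a minimum of 1024 and the minimum is used.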


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
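
The two rules in this entry can be sketched as follows. struct
limit_sketch and init_limit() are hypothetical stand-ins for struct rlimit
and the proc0 setup code; only the arithmetic is taken from the entry.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rlimit. */
struct limit_sketch {
	unsigned long cur;	/* soft limit (rlim_cur) */
	unsigned long max;	/* hard limit (rlim_max) */
};

static unsigned long
ulmin(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * compile_default plays the role of NOFILE or MAXUPRC,
 * system_max the role of maxfiles or maxproc.
 */
static struct limit_sketch
init_limit(unsigned long compile_default, unsigned long system_max)
{
	struct limit_sketch l;

	l.max = system_max;		/* not RLIM_INFINITY */
	l.cur = ulmin(compile_default, system_max);
	assert(l.cur <= l.max);
	return l;
}
```

Clamping the soft limit this way keeps rlim_cur from exceeding rlim_max
on a kernel configured with a small maxfiles or maxproc.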


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
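
The mapping above swaps the source and destination arguments for the copy
routines, which is easy to get wrong. A minimal sketch in standard C follows;
the old_* wrappers are hypothetical stand-ins written for illustration, not
the kernel's implementations.

```c
#include <string.h>

/*
 * Hypothetical stand-ins for the legacy BSD calls, expressed with the
 * ISO C functions from the mapping. The legacy copy routines take
 * (src, dst, len); the ISO ones take (dst, src, len).
 */
static void
old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);		/* regions must not overlap */
}

static void
old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);		/* safe for overlapping regions */
}

static int
old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);	/* zero means the regions match */
}

static void
old_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```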


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONGIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards
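
A short sketch of why rounding direction matters when a pointer is moving
downward. This assumes the conventional BSD definition in which ALIGN()
rounds up; the names align_up(), align_down() and SKETCH_ALIGNBYTES are
hypothetical, and the alignment of sizeof(long) is an assumption.

```c
#include <stdint.h>

/* Assumed alignment mask for illustration; real values are per-port. */
#define SKETCH_ALIGNBYTES ((uintptr_t)(sizeof(long) - 1))

/* Round up: the result is never below p (what an ALIGN()-style macro does). */
static uintptr_t
align_up(uintptr_t p)
{
	return (p + SKETCH_ALIGNBYTES) & ~SKETCH_ALIGNBYTES;
}

/*
 * Round down: the result is never above p. When space is being carved
 * out below an address, this is the direction that stays inside the
 * region already reserved.
 */
static uintptr_t
align_down(uintptr_t p)
{
	return p & ~SKETCH_ALIGNBYTES;
}
```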


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.513 27-Dec-2019 ad

Redo the page allocator to perform better, especially on multi-core and
multi-socket systems. Proposed on tech-kern. While here:

- add rudimentary NUMA support - needs more work.
- remove now unused "listq" from vm_page.


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
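
The subtraction described here amounts to timespec arithmetic with a borrow
from the seconds field. A minimal sketch, assuming the helper name ts_sub()
(hypothetical; the kernel has its own timespec macros, which are not shown in
this log):

```c
#include <time.h>

/*
 * Hypothetical helper: return a - b for two timespecs, borrowing one
 * second when the nanosecond difference goes negative. Assumes both
 * inputs are normalized (0 <= tv_nsec < 1000000000).
 */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}
```

Under this sketch, the boot time would be the time of day at initialization
minus the interval the clocks had already been running.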

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
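
Since the use of initpaths is undocumented, the following user-space
sketch shows one plausible reading: the entries are tried in order and
the first usable one wins. The probe callback, the demo probes and the
name first_usable_init() are hypothetical and do not appear in the
kernel; only the list contents are taken from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The list as quoted in the log message; the array name is made up. */
const char *const initpaths_sketch[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Stand-in for "can this program be started?"; purely illustrative. */
typedef bool (*path_probe_fn)(const char *path);

/* Return the first path the probe accepts, or NULL if none is usable. */
const char *
first_usable_init(const char *const *paths, path_probe_fn probe)
{
	assert(paths != NULL && probe != NULL);
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (probe(paths[i]))
			return paths[i];
	}
	return NULL;
}

/* Demo probe: pretend only the rescue binary is present. */
bool
probe_rescue_only(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}

/* Demo probe: pretend nothing is present. */
bool
probe_nothing(const char *path)
{
	(void)path;
	return false;
}
```

With probe_rescue_only the sketch falls through the three /sbin entries
and returns "/rescue/init"; with probe_nothing it returns NULL.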


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate the vnode cache size based on the resource it gets allocated from.
This stops kern.maxvnodes from being set so high that it exhausts the
available space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
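
The MODFLG_MUST_FORCE rule from item 2 can be sketched in user-space C
as follows. This is a hedged illustration only: the struct, the sk_
prefixed names and the numeric flag values are hypothetical and are not
taken from the module code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical values; the real definitions live in the module headers. */
#define SK_MODFLG_MUST_FORCE	0x01u
#define SK_MODCTL_LOAD_FORCE	0x01u

/* Simplified stand-in for a built-in module record. */
struct sk_module {
	const char *name;
	unsigned flags;
};

/* Mark a built-in as explicitly unloaded; the flag is never cleared. */
void
sk_module_require_force(struct sk_module *m)
{
	assert(m != NULL);
	m->flags |= SK_MODFLG_MUST_FORCE;
}

/* A flagged built-in may only be re-loaded when the caller forces it. */
bool
sk_builtin_load_allowed(const struct sk_module *m, unsigned load_flags)
{
	assert(m != NULL);
	if ((m->flags & SK_MODFLG_MUST_FORCE) == 0)
		return true;
	return (load_flags & SK_MODCTL_LOAD_FORCE) != 0;
}
```

Keeping the flag one-way (set, never unset) means a later plain load
request cannot silently undo an explicit unload.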


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
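
A rough user-space sketch of how input at the init path prompt could be
classified, given the special strings listed above. The enum, the
function name and the rule that a path must begin with '/' are
assumptions made for illustration; the kernel's actual parsing may
differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes for one line of prompt input. */
enum init_input {
	INPUT_HALT,
	INPUT_REBOOT,
	INPUT_DDB,
	INPUT_PATH,
	INPUT_INVALID
};

/* Map a line typed at the prompt to an action. */
enum init_input
classify_init_input(const char *s)
{
	assert(s != NULL);
	if (strcmp(s, "halt") == 0)
		return INPUT_HALT;
	if (strcmp(s, "reboot") == 0)
		return INPUT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INPUT_DDB;
	/* Assumption: anything else must look like an absolute path. */
	if (s[0] == '/')
		return INPUT_PATH;
	return INPUT_INVALID;
}
```

An INPUT_INVALID result is the natural place to print a usage line and
prompt again instead of panicking.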


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
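
A once control of the kind described here can be sketched in userland C roughly as follows. This is an illustration under stated assumptions, not the kernel implementation: struct once, run_once() and demo_init_fails() are hypothetical names, and a pthread mutex plus C11 atomics stand in for the kernel's global lock and atomic ops. It shows the two properties the message mentions: a lockless fast path, and no retry after the routine returns an error.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once control; the layout is an assumption for this sketch. */
struct once {
	atomic_int done;	/* 0 = not yet run, 1 = run, result in error */
	int error;		/* return value of the single init call */
};

/* One global lock shared by all once controls, so none needs a
 * static per-object lock initializer. */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);

	/* Fast path: no lock is taken once the routine has run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; an error is recorded, not retried. */
		o->error = init();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Demo routine that always fails and counts how often it is called. */
static int demo_calls;

static int
demo_init_fails(void)
{
	demo_calls++;
	return 5;	/* arbitrary non-zero error code */
}
```

Calling run_once() twice with demo_init_fails() returns the same error both times while the routine itself runs only once. Note that in this sketch an init routine that itself calls run_once() would deadlock on the global lock.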


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
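
The limit described above amounts to simple arithmetic, sketched here in userland C. The names max_desiredvnodes() and set_desiredvnodes(), the use of EINVAL, and the exact rounding are assumptions for illustration. Only the 2kB cost and the 75% bounds come from the log message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Worst-case cost of one vnode and its associated data structures. */
#define VNODE_COST_BYTES 2048u

/* Largest vnode count whose worst-case footprint stays within 75% of
 * both the kernel virtual address space and physical memory. */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the smaller pool, divided by the per-vnode cost. */
	return smaller / 4 * 3 / VNODE_COST_BYTES;
}

/* Hypothetical sysctl-style setter: reject values above the limit. */
static int
set_desiredvnodes(uint64_t *desired, uint64_t requested,
    uint64_t kva_bytes, uint64_t phys_bytes)
{
	assert(desired != NULL);

	if (requested > max_desiredvnodes(kva_bytes, phys_bytes))
		return EINVAL;	/* assumption: the real error code may differ */
	*desired = requested;
	return 0;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA bound is the tighter one and the sketch allows at most 393216 vnodes.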


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
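
The reference-counting rule in the list above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be sketched in userland C as follows. This is an illustration only: struct ldisc, ld_attach(), ld_lookup(), ld_release() and ld_detach() are hypothetical names that do not reproduce the ttyldisc_*() signatures, the EEXIST and ENOENT returns are assumptions, and locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line discipline record kept on a singly linked list. */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* references held by lookups */
	struct ldisc *next;
};

static struct ldisc *ldisc_head;

/* Register a discipline; duplicate names are rejected (assumption). */
static int
ld_attach(struct ldisc *ld)
{
	assert(ld != NULL && ld->name != NULL);

	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_head;
	ldisc_head = ld;
	return 0;
}

/* Find a discipline by name and take a reference on it. */
static struct ldisc *
ld_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference taken by ld_lookup(). */
static void
ld_release(struct ldisc *ld)
{
	assert(ld != NULL && ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister a discipline; refuse while references are outstanding. */
static int
ld_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	return ENOENT;
}
```

A provider would call ld_attach() once at registration time. A consumer pairs each ld_lookup() with an ld_release(), and ld_detach() succeeds only after the last reference is dropped.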


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

    @machine        value of MACHINE for the system
    @machine_arch   value of MACHINE_ARCH for the system
    @hostname       the system host name, as set with sethostname()
    @domainname     the system domain name, as set with setdomainname()
    @kernel_ident   the kernel config file name
    @osrelease      the release number of the OS
    @ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
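
The expansion described above can be sketched in user-space C as follows.
This is an illustration only, not the kernel code from this commit: the
names magic_var and magic_expand are hypothetical, and the rule that a
magic string must be followed by '/' or the end of the target is an
assumption made so that "@machine" does not match the front of
"@machine_arch".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name and the value it expands to. */
struct magic_var {
	const char *name;	/* e.g. "machine_arch", without the '@' */
	const char *value;	/* e.g. "i386" */
};

/*
 * Copy `src` to `dst`, replacing each "@name" found in `vars`.
 * Assumption: a name only matches when followed by '/' or end of string.
 * Unknown "@..." sequences are copied through unchanged.
 * Returns 0 on success, -1 if the result does not fit in `dstlen` bytes.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const char *repl = NULL;
		size_t skip = 1, i;

		if (*src == '@') {
			for (i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (strncmp(src + 1, vars[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					repl = vars[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (repl != NULL) {
			size_t rl = strlen(repl);

			if (out + rl >= dstlen)
				return -1;
			memcpy(dst + out, repl, rl);
			out += rl;
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src;
		}
		src += skip;
	}
	dst[out] = '\0';
	return 0;
}
```

With a table mapping "machine_arch" to the running architecture, the
target /arch/@machine_arch/bin from the example would resolve to the
matching per-architecture directory.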


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
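
As a rough illustration of the sizing rule discussed above, the following
C sketch computes a default vnode count as a fraction of physical memory
with a lower bound. It is not the code from this commit: the function
name, the per-mille parameter, and the vnode size passed in by the caller
are all hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in `permille` thousandths of physical memory
 * (10 for 1%, 5 for 0.5%), but never fewer than `floor_vnodes`.
 * All parameters are illustrative; the kernel derives these differently.
 */
uint64_t
vnode_cache_default(uint64_t physmem_bytes, size_t vnode_size,
    unsigned permille, uint64_t floor_vnodes)
{
	uint64_t n;

	if (vnode_size == 0)
		return floor_vnodes;

	/* Divide first so large memory sizes cannot overflow. */
	n = physmem_bytes / 1000 * permille / vnode_size;
	return n > floor_vnodes ? n : floor_vnodes;
}
```

Doubling the fraction from 5 to 10 per mille roughly doubles the result
for any machine large enough to be above the floor, and a larger
struct vnode (as on 64-bit ports) lowers the count for the same fraction.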


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now the same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now a private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
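
The "invoke until nobody reports work" rule for finalizers can be
sketched as below. This is a stand-alone illustration, not the kernel
interface added by this commit: the struct, the function names, and the
convention that a finalizer returns nonzero when it did work are
assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Run every finalizer in the list, and repeat the whole pass for as long
 * as at least one of them reports that it did work.
 * Returns the number of passes made (always at least 1).
 */
unsigned
run_finalizers(const struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;

	do {
		size_t i;

		did_work = 0;
		for (i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: consumes one unit of pending work per call. */
int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending > 0) {
		(*pending)--;
		return 1;
	}
	return 0;
}
```

Repeating the full pass matters because work done by one finalizer (for
example, configuring a RAID set) can expose new devices that give another
finalizer something to do on the next pass.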


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

    KERN_SUCCESS              0
    KERN_INVALID_ADDRESS      EFAULT
    KERN_PROTECTION_FAILURE   EACCES
    KERN_NO_SPACE             ENOMEM
    KERN_INVALID_ARGUMENT     EINVAL
    KERN_FAILURE              various, mostly turn into KASSERTs
    KERN_RESOURCE_SHORTAGE    ENOMEM
    KERN_NOT_RECEIVER         <unused>
    KERN_NO_ACCESS            <unused>
    KERN_PAGES_LOCKED         <unused>
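
For reference, the one-to-one rows of the mapping above can be written as
a small C function. This is only an illustration of the table: the enum
is declared here for the example and its numeric values are not claimed
to match the historical definitions, and kern_to_errno is a hypothetical
name. Codes without a single errno equivalent return -1.

```c
#include <errno.h>

/* Illustrative enum; values are not the historical KERN_* numbers. */
enum kern_code {
	KERN_SUCCESS,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_FAILURE,
	KERN_RESOURCE_SHORTAGE
};

/*
 * Translate an old-style code to the errno value listed in the table.
 * KERN_FAILURE had no single replacement ("various"), so it and any
 * unknown code yield -1 here.
 */
int
kern_to_errno(enum kern_code kc)
{
	switch (kc) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```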


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create a new kernel thread to issue the statfs(2) system call to check free
disk space rather than doing it in a timeout handler. This fixes the long
standing bug that the accounting file can't be put on an NFS file system (so,
e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
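
As a rough illustration of the "low address and a size" convention described above, the sketch below derives an initial stack pointer from the two values. The function name ex_initial_sp() and the grows_down flag are hypothetical, chosen for this example only; the real machine-dependent code is not shown in this log and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the caller always passes the lowest address of
 * the stack area plus its size.  Only this routine needs to know in
 * which direction the stack grows.
 */
static uintptr_t
ex_initial_sp(uintptr_t stack_lo, size_t stack_size, int grows_down)
{
	assert(stack_size > 0);

	/* Downward-growing stacks start at the top of the area. */
	if (grows_down)
		return stack_lo + stack_size;

	/* Upward-growing stacks start at the bottom. */
	return stack_lo;
}
```

With this shape the caller never has to care about stack direction, which appears to be the point of the interface described in the commit message.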


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
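
The rule in the two bullets above can be sketched as follows. This is a standalone illustration, not the kernel code: EX_NOFILE, EX_MAXUPRC, struct ex_rlimit and ex_init_limit() are hypothetical stand-ins for NOFILE, MAXUPRC, struct rlimit and the actual initialization in main(), and the constant values are made up.

```c
#include <assert.h>

/* Illustrative defaults only; not the real NOFILE/MAXUPRC values. */
enum { EX_NOFILE = 64, EX_MAXUPRC = 160 };

struct ex_rlimit {
	long rlim_cur;	/* soft limit */
	long rlim_max;	/* hard limit */
};

static long
ex_min(long a, long b)
{
	return a < b ? a : b;
}

/*
 * dflt is the compile-time default (NOFILE or MAXUPRC in the log);
 * system_max is the system-wide tunable (maxfiles or maxproc).
 */
static struct ex_rlimit
ex_init_limit(long dflt, long system_max)
{
	struct ex_rlimit rl;

	assert(system_max > 0);
	rl.rlim_max = system_max;		/* not RLIM_INFINITY */
	rl.rlim_cur = ex_min(dflt, system_max);	/* never above the hard limit */
	return rl;
}
```

A caller that sizes a table from the hard limit then gets a bounded value, which is the behaviour the commit message says programs expect.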


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
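
The part of this mapping that is easy to get wrong is the argument order: bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. A minimal userland sketch of the converted forms (the ex_* wrapper names are hypothetical and exist only for this example):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* was: bcopy(src, dst, len) */
static void
ex_copy(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

/* was: ovbcopy(buf + from, buf + to, len); the regions may overlap */
static void
ex_shift(char *buf, size_t from, size_t to, size_t len)
{
	memmove(buf + to, buf + from, len);
}

/* was: bcmp(a, b, len) == 0 */
static int
ex_same(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}

/* was: bzero(buf, len) */
static void
ex_clear(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```

Note that bcmp() only reports equal or not equal, so replacing it with memcmp() is safe as long as callers test the result against zero.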


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.512 22-Dec-2019 ad

Fix integer overflow when printing available memory size (resulting from
a cast lost during merges).
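
The class of bug described here can be sketched in isolation. The example below is an assumption about the general pattern (a page count multiplied by a page size in a type that is too narrow), not the actual init_main.c code; the ex_* names are hypothetical, and unsigned 32-bit arithmetic is used so the wraparound is well defined on platforms where int is 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Without the widening cast the product is computed in 32 bits and wraps. */
static uint64_t
ex_bytes_narrow(uint32_t npages, uint32_t pagesize)
{
	uint32_t bytes = npages * pagesize;	/* truncated to 32 bits */

	return bytes;
}

/* Widening one operand first keeps the full 64-bit product. */
static uint64_t
ex_bytes_wide(uint32_t npages, uint32_t pagesize)
{
	return (uint64_t)npages * pagesize;
}
```

For example, 2097152 pages of 4096 bytes is 8 GiB, which does not fit in 32 bits, so only the widened version reports it correctly.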

Reported-by: syzbot+f02ca5f83ac7196b8afd@syzkaller.appspotmail.com


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
while the simpler version that is already committed only reports that a leak
has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
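
The arithmetic behind this change can be sketched as two timespec operations: boottime is the wall-clock time at the moment the time of day was initialised minus the uptime at that moment, and a message's wall-clock time is boottime plus the uptime stamp logged with the message. The helpers below are a userland sketch with hypothetical ex_* names; they assume normalized, non-negative values and are not the kernel's own routines.

```c
#include <assert.h>
#include <time.h>

#define EX_NSEC_PER_SEC 1000000000L

/* Returns a - b; assumes a >= b and both values are normalized. */
static struct timespec
ex_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += EX_NSEC_PER_SEC;
	}
	return r;
}

/* Returns a + b; assumes both values are normalized. */
static struct timespec
ex_timespec_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= EX_NSEC_PER_SEC) {
		r.tv_sec++;
		r.tv_nsec -= EX_NSEC_PER_SEC;
	}
	return r;
}

/* boottime = time of day when it was set - time since the clocks started */
static struct timespec
ex_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ex_timespec_sub(tod_at_init, uptime_at_init);
}

/* wall-clock time of a log message = boottime + its uptime stamp */
static struct timespec
ex_message_time(struct timespec boottime, struct timespec stamp)
{
	return ex_timespec_add(boottime, stamp);
}
```

If the uptime at initialisation were not subtracted, every converted timestamp would be late by that interval, which matches the roughly 6 second offset reported above.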


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix the missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.
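
The restriction described above (one literal format string plus at most
four arguments per message) can be pictured with a small userland sketch.
This is not the kernhist implementation; struct hist, hist_log(),
HIST_NARGS and HIST_SIZE are hypothetical names chosen for illustration,
and the ring size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HIST_NARGS 4	/* the four-argument limit from the commit message */
#define HIST_SIZE  8	/* hypothetical ring size for this sketch */

/* One record: a pointer to a literal format string plus four integers. */
struct hist_entry {
	const char *fmt;
	uintptr_t args[HIST_NARGS];
};

struct hist {
	struct hist_entry e[HIST_SIZE];
	size_t next;	/* slot the next record is written to */
	size_t count;	/* number of valid records, at most HIST_SIZE */
};

/* Record an event; once the ring is full the oldest record is overwritten. */
static void
hist_log(struct hist *h, const char *fmt,
    uintptr_t a0, uintptr_t a1, uintptr_t a2, uintptr_t a3)
{
	assert(h != NULL && fmt != NULL);

	struct hist_entry *he = &h->e[h->next];
	he->fmt = fmt;		/* only the pointer is stored, not a copy */
	he->args[0] = a0;
	he->args[1] = a1;
	he->args[2] = a2;
	he->args[3] = a3;

	h->next = (h->next + 1) % HIST_SIZE;
	if (h->count < HIST_SIZE)
		h->count++;
}
```

Because each record holds only integers and a format pointer, logging
stays cheap; a syscall name would have to be passed as a number and
resolved when the history is printed.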


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
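
A userland sketch of walking such a NULL-terminated fallback list,
assuming the candidates are tried in order until one succeeds. The list
contents are taken from the message above; pick_init() and the try_exec
callback are hypothetical stand-ins, not the code in start_init().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list quoted in the commit message above. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path for which try_exec() reports success (0), or
 * NULL if every candidate fails.  try_exec is a hypothetical callback
 * standing in for the actual exec attempt.
 */
static const char *
pick_init(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);

	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Example callback: pretend that only /rescue/init can be executed. */
static int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Example callback: pretend that nothing can be executed. */
static int
always_fail(const char *path)
{
	(void)path;
	return -1;
}
```

With only_rescue, pick_init(initpaths, only_rescue) skips the first three
entries and returns the last one; with always_fail it returns NULL.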


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
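
The register/evaluate idea can be sketched in userland as a table of
callbacks keyed by secmodel id. This is an illustration only, not the
secmodel(9) implementation: sketch_register(), sketch_eval(), the fixed
table size and the ENOENT/EINVAL return values are assumptions made for
the example.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Callback type: evaluate request "what" on arg, store the answer in ret. */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct evaluator {
	const char *id;
	eval_fn fn;
};

#define MAX_EVALUATORS 8	/* hypothetical limit for this sketch */
static struct evaluator evaluators[MAX_EVALUATORS];
static size_t nevaluators;

/* Register an evaluator under a secmodel id; 0 on success. */
static int
sketch_register(const char *id, eval_fn fn)
{
	assert(id != NULL && fn != NULL);

	if (nevaluators == MAX_EVALUATORS)
		return ENOMEM;
	evaluators[nevaluators].id = id;
	evaluators[nevaluators].fn = fn;
	nevaluators++;
	return 0;
}

/* Forward a request to the named model, or fail if it is absent. */
static int
sketch_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < nevaluators; i++) {
		if (strcmp(evaluators[i].id, id) == 0)
			return evaluators[i].fn(what, arg, ret);
	}
	return ENOENT;	/* assumed error code for a missing model */
}

/* Example evaluator: answers "is-root" for a uid passed through arg. */
static int
suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

The caller never touches another model's state directly; when the model
is not registered the request simply fails, which is the absence case
described above.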

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).
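
The deferral pattern described above can be sketched in userland as a
queue of callbacks that is drained once the root file system is
available. This is an assumption-laden illustration, not the
config_mountroot(9) code: defer_until_mountroot(), run_deferred() and the
fixed queue size are hypothetical, and running a late registration
immediately is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 16	/* hypothetical queue size for this sketch */

struct deferred {
	void (*fn)(void *);
	void *arg;
};

static struct deferred deferred_q[MAX_DEFERRED];
static size_t ndeferred;
static int root_mounted;

/*
 * Queue fn(arg) until the root file system is mounted.  If it is
 * already mounted, run the callback right away.  Returns 0 on
 * success, -1 if the queue is full.
 */
static int
defer_until_mountroot(void (*fn)(void *), void *arg)
{
	assert(fn != NULL);

	if (root_mounted) {
		fn(arg);
		return 0;
	}
	if (ndeferred == MAX_DEFERRED)
		return -1;
	deferred_q[ndeferred].fn = fn;
	deferred_q[ndeferred].arg = arg;
	ndeferred++;
	return 0;
}

/* Called once after the root mount: run callbacks in registration order. */
static void
run_deferred(void)
{
	root_mounted = 1;
	for (size_t i = 0; i < ndeferred; i++)
		deferred_q[i].fn(deferred_q[i].arg);
	ndeferred = 0;
}

/* Example callback: count how many times "firmware load" was attempted. */
static void
bump(void *arg)
{
	++*(int *)arg;
}
```

A driver that needs a firmware file would register its second-stage
attach through such a queue instead of reading the file during
autoconfiguration, when no file system is mounted yet.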

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
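
The prompt handling described above can be sketched roughly as follows.
This is only an illustration, not the code in init_main.c: the enum, the
function name classify_init_reply() and the rule that a valid path must
be absolute are all assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of what the operator typed at the prompt. */
enum init_reply {
	INIT_REPLY_PATH,	/* try to exec this path as init */
	INIT_REPLY_HALT,
	INIT_REPLY_REBOOT,
	INIT_REPLY_DDB,		/* enter the kernel debugger */
	INIT_REPLY_INVALID	/* caller would print usage and ask again */
};

static enum init_reply
classify_init_reply(const char *s)
{
	/* The three special strings named in the commit message. */
	if (strcmp(s, "halt") == 0)
		return INIT_REPLY_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_REPLY_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_REPLY_DDB;

	/* Assumption: only absolute paths count as valid init paths. */
	if (s[0] == '/')
		return INIT_REPLY_PATH;

	return INIT_REPLY_INVALID;
}
```

Keeping the classification separate from the prompt loop makes the
special strings easy to extend without touching the I/O code.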


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
useful semantics for that.
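
A minimal userland sketch of the idea, assuming C11 atomics and a single
global spinlock in place of the kernel's locking primitives. The names
once_sketch, run_once_sketch() and failing_init() are hypothetical and
this is not the subr_once.c implementation; in particular a real kernel
would not call the routine while spinning on a lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical once-control; not the kernel's own once types. */
struct once_sketch {
	atomic_int done;	/* 0 = not yet run, 1 = finished */
	int error;		/* result of the single run */
};

/* One global lock shared by every once-control (a trivial spinlock). */
static atomic_flag once_global_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock is taken once the routine has finished. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_global_lock,
	    memory_order_acquire))
		continue;	/* spin until the global lock is free */

	/* Re-check under the lock: another thread may have won the race. */
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = fn();	/* runs once, even if it fails */
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}

	atomic_flag_clear_explicit(&once_global_lock, memory_order_release);
	return o->error;
}

/* Example routine for illustration: counts its calls and always fails. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return -1;
}

/* Statically allocated control; zero-initialized means "not yet run". */
static struct once_sketch demo_once;
```

Calling run_once_sketch(&demo_once, failing_init) twice returns the
stored error both times and runs failing_init() only once, which matches
the "no retry after an error" behaviour described above.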


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
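
The bound can be worked through with a small sketch. The 2kB cost and
the 75% factor come from the message above; the helper names and the
choice to apply the factor to the smaller of the two resources are
assumptions for illustration, not the actual sysctl handler.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode plus associated structures (2kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: largest vnode count whose worst-case footprint
 * stays within 75% of both kernel virtual address space and physical
 * memory.
 */
static uint64_t
vnodes_upper_bound(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the smaller resource, divided by the per-vnode cost. */
	return (limit / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical validation step, as a sysctl handler might perform it. */
static int
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= vnodes_upper_bound(kva_bytes, phys_bytes);
}
```

For example, with 1GB of KVA and 4GB of physical memory the KVA is the
tighter limit: 75% of 1GB is 768MB, and 768MB / 2kB = 393216 vnodes.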


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
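
The reference-counting rules in the list above can be sketched as
follows. This is a simplified illustration with no locking, and every
name in it (ldisc_sketch and the *_sketch functions) is hypothetical,
as is the EEXIST result for a duplicate name. The real interfaces are
ttyldisc_attach(), ttyldisc_detach() and ttyldisc_release().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; the real struct linesw is much larger. */
struct ldisc_sketch {
	const char *name;
	unsigned refcnt;		/* references held by lookups */
	struct ldisc_sketch *next;	/* linked list, not a fixed array */
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline; duplicate names are rejected (assumption). */
static int
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	for (struct ldisc_sketch *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* A successful lookup takes a reference on the discipline. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	for (struct ldisc_sketch *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from a lookup. */
static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	assert(d->refcnt > 0);
	d->refcnt--;
}

/* Detaching fails with EBUSY while any reference is outstanding. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc_sketch **pp = &ldisc_list; *pp != NULL;
	    pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			return 0;
		}
	}
	return ENOENT;
}

/* Example discipline used only to exercise the sketch. */
static struct ldisc_sketch demo_disc = { .name = "demo" };
```

After attaching demo_disc, a lookup makes a detach attempt return EBUSY
until the reference is released, at which point the detach succeeds.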


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
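
Editor's sketch: the expansion described above can be pictured as a table-driven substitution over the link target. This is an illustration only, not the kernel routine; expand_magic() and struct magic are hypothetical, and the rule that a magic name only matches when followed by '/' or the end of the string is an assumption made here so that @machine does not shadow @machine_arch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: magic name (without '@') and its value. */
struct magic {
	const char *name;
	const char *value;
};

/*
 * Copy "in" to "out", replacing each "@name" found in the table.
 * Returns the length of the result, or -1 if "out" is too small.
 */
int
expand_magic(const char *in, const struct magic *tab, size_t ntab,
    char *out, size_t outsz)
{
	size_t o = 0;

	if (outsz == 0)
		return -1;
	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*in == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				/* Assumed rule: name must end a component. */
				if (strncmp(in + 1, tab[i].name, n) == 0 &&
				    (in[1 + n] == '/' || in[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			if (o + n >= outsz)
				return -1;
			memcpy(out + o, rep, n);
			o += n;
			in += skip;
		} else {
			/* Ordinary character, or an unknown "@" string. */
			if (o + 1 >= outsz)
				return -1;
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return (int)o;
}
```

With a table that maps machine_arch to a sample value such as "x86_64", the link target /arch/@machine_arch/bin from the example would resolve to /arch/x86_64/bin; unknown @strings are left untouched.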


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
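
Editor's sketch: a marker-skipping iteration macro of the kind described above could be shaped like this. The struct item, its is_marker field and ITEM_FOREACH are hypothetical; the real macro operates on the kernel's process lists and its exact definition is not reproduced here.

```c
#include <stddef.h>

/* Hypothetical list node; markers are placeholders left by an iterator. */
struct item {
	int		 id;
	int		 is_marker;
	struct item	*next;
};

/*
 * Visit every non-marker entry.  The trailing "if" means the macro
 * must be followed by a single statement or a braced block.
 */
#define ITEM_FOREACH(var, head)						\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (!(var)->is_marker)

/* Count the real (non-marker) entries on a list. */
int
count_real(struct item *head)
{
	struct item *it;
	int n = 0;

	ITEM_FOREACH(it, head) {
		n++;
	}
	return n;
}
```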


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
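
Editor's sketch: the "invoke all finalizers repeatedly until none of them reports work" rule from the first paragraph above amounts to a fixed-point loop. The names finalizer_fn, struct finalizer, run_finalizers() and countdown_finalizer() are hypothetical and only illustrate the control flow.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn	 fn;
	void		*arg;
};

/*
 * Call every finalizer, and keep making passes for as long as at
 * least one of them reports work.  Returns the number of passes.
 */
int
run_finalizers(const struct finalizer *f, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (f[i].fn(f[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Demo finalizer: has *arg units of work and does one per call. */
int
countdown_finalizer(void *arg)
{
	int *left = arg;

	if (*left > 0) {
		(*left)--;
		return 1;
	}
	return 0;
}
```

Iterating to a fixed point matters when one finalizer's work (for example, configuring a RAID set) can expose new devices for another finalizer to handle.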


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
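
Editor's sketch: the mapping above, written as a translation function. The OLD_KERN_* enumerators are hypothetical placeholders for the removed codes (their numeric values here are not taken from the source), and returning -1 for KERN_FAILURE and the unused codes is an assumption reflecting that the table gives them no single errno.

```c
#include <errno.h>

/* Hypothetical stand-ins for the removed KERN_* codes. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE,
	OLD_KERN_NOT_RECEIVER,
	OLD_KERN_NO_ACCESS,
	OLD_KERN_PAGES_LOCKED
};

/* Translate an old code to the errno value listed in the table. */
int
old_kern_to_errno(enum old_kern_code c)
{
	switch (c) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		/* KERN_FAILURE and the unused codes: no single mapping. */
		return -1;
	}
}
```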


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
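
Editor's sketch: honoring a minimum power-of-two alignment usually comes down to rounding a candidate address up with a mask. The helper align_up() below is hypothetical and is not taken from uvm_map(); treating an alignment of 0 as "no constraint" is an assumption of this sketch.

```c
#include <stdint.h>

/*
 * Round addr up to the next multiple of align, which must be a
 * power of two.  An align of 0 is treated as "no constraint".
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works because align is a power of two: align - 1 then has exactly the low bits set that must be cleared.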


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
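
Editor's sketch: the sizing rule above (0.5% of memory, but never fewer than NVNODE vnodes) is simple arithmetic. The function auto_vnode_limit() and its parameters are hypothetical; in particular the per-vnode size is passed in rather than assumed, since the real structure size varies by port.

```c
#include <stdint.h>

/*
 * Return how many vnodes fit in 0.5% of physical memory, but never
 * fewer than floor_vnodes (the configured NVNODE in the description).
 */
uint64_t
auto_vnode_limit(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t floor_vnodes)
{
	uint64_t budget, n;

	if (vnode_size == 0)
		return floor_vnodes;
	budget = physmem_bytes / 200;	/* 0.5% of memory */
	n = budget / vnode_size;
	return n > floor_vnodes ? n : floor_vnodes;
}
```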


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
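
The "low address and a size" convention in this entry can be illustrated with a small sketch. The helper name and the grows_down flag are hypothetical; real machine-dependent code would also apply alignment and frame setup, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the initial stack pointer for a stack
 * region described by its lowest address and its size.
 */
static uintptr_t
sketch_initial_sp(uintptr_t stack_low, size_t stack_size, int grows_down)
{
	assert(stack_size != 0);
	/* Grows down: start at the top of the region. Grows up: start at the base. */
	return grows_down ? stack_low + stack_size : stack_low;
}
```

The caller never needs to know the stack direction; only this one machine-dependent step does.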


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
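
The rule in the two items above can be sketched in a few lines of C. The struct, type and function names below are stand-ins invented for illustration, not the kernel's own declarations.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sketch_rlim_t;		/* stand-in for rlim_t */

struct sketch_rlimit {
	sketch_rlim_t rlim_cur;		/* soft limit */
	sketch_rlim_t rlim_max;		/* hard limit */
};

/*
 * Hypothetical helper: the hard limit is the system-wide maximum
 * (e.g. maxfiles or maxproc) and the soft limit is the compile-time
 * default (e.g. NOFILE or MAXUPRC) capped by that maximum.
 */
static struct sketch_rlimit
sketch_init_limit(sketch_rlim_t default_soft, sketch_rlim_t system_max)
{
	struct sketch_rlimit rl;

	rl.rlim_max = system_max;
	rl.rlim_cur = default_soft < system_max ? default_soft : system_max;
	assert(rl.rlim_cur <= rl.rlim_max);
	return rl;
}
```

A program calling getrlimit() then sees a finite hard limit it can size its tables against, and a soft limit that never exceeds it.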


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
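
The easy mistake in this conversion is the argument order: bcopy() and ovbcopy() take (src, dst, len), while memcpy() and memmove() take (dst, src, len). A minimal sketch in standard C, using hypothetical wrapper names that do not appear in the tree:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper keeping the old bcopy(src, dst, len) order. */
static void
old_style_copy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	/* Destination comes first here; memmove also tolerates overlap. */
	memmove(dst, src, len);
}

/* Hypothetical wrapper keeping the old bzero(buf, len) shape. */
static void
old_style_zero(void *buf, size_t len)
{
	assert(buf != NULL);
	memset(buf, 0, len);
}
```

The sketch uses memmove() so that it is safe for both the bcopy and ovbcopy cases; memcpy() is only valid when the regions do not overlap.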


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.511 21-Dec-2019 ad

uvmexp.free -> uvm_free()


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
while the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.
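
The computation described here amounts to adding two timespec-style values (boot time plus the uptime stamped on the message) and carrying nanoseconds into seconds. A minimal sketch follows; the struct and function names are hypothetical and this is not the dmesg(8) source.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_timespec {
	int64_t tv_sec;		/* whole seconds */
	long tv_nsec;		/* nanoseconds, 0 .. 999999999 */
};

/*
 * Hypothetical helper: wall-clock time of a message is the boot time
 * plus the uptime recorded when the message was logged.
 */
static struct sketch_timespec
sketch_ts_add(struct sketch_timespec a, struct sketch_timespec b)
{
	struct sketch_timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
	assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);
	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= 1000000000L) {
		/* Carry one second out of the nanosecond field. */
		r.tv_sec++;
		r.tv_nsec -= 1000000000L;
	}
	return r;
}
```

Treating the nanosecond field as microseconds (a timeval) would skew the sub-second part, which is why the entry stresses that boottime is a timespec.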

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false, which indicates that
the netinet6 component isn't loaded on rump kernels. However, the flag is
accidentally always false, because it is turned on in in6_dom_init, which is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.
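
The ordering problem described above can be modelled in a few lines of
userspace C. This is only an illustrative sketch: every demo_* name is a
hypothetical stand-in, and the bodies are not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins; none of this is the kernel's actual code. */
static bool demo_in6_present;	/* set once the IPv6 domain is initialised */
static bool demo_ifq_node;	/* true once the sysctl node has been created */

static void
demo_reset(void)
{
	demo_in6_present = false;
	demo_ifq_node = false;
}

static void
demo_in6_dom_init(void)
{
	demo_in6_present = true;
}

/* Creates the node only if the IPv6 flag is already set when called. */
static void
demo_if_sysctl_setup(void)
{
	if (demo_in6_present)
		demo_ifq_node = true;
}

/* Order before the fix: the flag is tested before anything sets it. */
static bool
demo_boot_old_order(void)
{
	demo_reset();
	demo_if_sysctl_setup();
	demo_in6_dom_init();
	return demo_ifq_node;
}

/* Order after the fix: domain init runs first, so the node is created. */
static bool
demo_boot_new_order(void)
{
	demo_reset();
	demo_in6_dom_init();
	assert(demo_in6_present);
	demo_if_sysctl_setup();
	return demo_ifq_node;
}
```

In this model demo_boot_old_order() leaves the node missing, while
demo_boot_new_order() creates it.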

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as are lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
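
A minimal userspace sketch of how such a NULL-terminated list can be
consumed, assuming the candidates are tried in order until one can be
executed. The helper names (demo_pick_init, demo_only_rescue) and the
try_exec callback are hypothetical and not taken from the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as given in the commit message above. */
static const char *const demo_initpaths[] = {
	"/sbin/init",
	"/sbin/oinit",
	"/sbin/init.bak",
	"/rescue/init",
	NULL,
};

/*
 * Return the first path for which try_exec() reports success (0), or
 * NULL if every candidate fails.  try_exec is a caller-supplied stand-in
 * for the real attempt to execute the program.
 */
static const char *
demo_pick_init(const char *const *paths, int (*try_exec)(const char *))
{
	assert(paths != NULL && try_exec != NULL);
	for (; *paths != NULL; paths++) {
		if (try_exec(*paths) == 0)
			return *paths;
	}
	return NULL;
}

/* Example predicate: pretend only the rescue copy exists. */
static int
demo_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

Calling demo_pick_init(demo_initpaths, demo_only_rescue) falls through
the first three entries and returns the newly added last one.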


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device, booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
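
The prompt handling described above can be sketched roughly as follows.
This is an illustrative userland classifier, not the code in init_main.c:
the enum, the function name classify_init_answer, and the rule that a valid
path must begin with '/' are all assumptions made for the example.

```c
#include <string.h>

/* Hypothetical result codes for one line typed at the init path prompt. */
enum init_answer {
	ANSWER_PATH,	/* try to exec this path as init */
	ANSWER_HALT,
	ANSWER_REBOOT,
	ANSWER_DDB,
	ANSWER_INVALID	/* caller would print a usage line and ask again */
};

enum init_answer
classify_init_answer(const char *s)
{
	/* The three special strings named in the commit message. */
	if (strcmp(s, "halt") == 0)
		return ANSWER_HALT;
	if (strcmp(s, "reboot") == 0)
		return ANSWER_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return ANSWER_DDB;

	/* Assumption for this sketch: only absolute paths are accepted. */
	if (s[0] == '/')
		return ANSWER_PATH;

	return ANSWER_INVALID;
}
```

Anything that is neither a special string nor a path falls through to
ANSWER_INVALID, which is the case where a usage line would be shown.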


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
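
A rough userland sketch of the scheme described above, assuming C11 atomics
and a single global pthread mutex in place of the kernel primitives. The
names once_sketch, run_once_sketch, and failing_init_sketch are hypothetical.
It shows the lockless fast path and the "no retry after an error" behaviour:
the first result is recorded and returned to every later caller.

```c
#include <pthread.h>
#include <stdatomic.h>

struct once_sketch {
	atomic_int done;	/* 0 = not yet run, 1 = finished */
	int error;		/* result of the single run */
};

/* One global lock shared by all once objects, as in the description. */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock taken once the routine has finished. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		o->error = fn();	/* runs exactly once, pass or fail */
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Example routine that always fails, to show that it is not retried. */
int init_calls_sketch;

int
failing_init_sketch(void)
{
	init_calls_sketch++;
	return 5;
}
```

Calling run_once_sketch() twice on the same object with
failing_init_sketch returns the same error both times while the routine
itself runs only once.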


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
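
The limit described above works out to simple arithmetic. The following is
a sketch only: the function name max_desiredvnodes_sketch and the constant
VNODE_COST_SKETCH are hypothetical, and the real check may be structured
differently. It takes the smaller of KVA and physical memory, keeps 75% of
it, and divides by the assumed 2kB worst-case cost per vnode.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode and its associated structures. */
#define VNODE_COST_SKETCH	2048u

uint64_t
max_desiredvnodes_sketch(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	/* Both limits apply, so the smaller resource decides. */
	uint64_t smaller =
	    kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* 75% of it; divide first so large sizes cannot overflow. */
	uint64_t budget = smaller / 4 * 3;

	return budget / VNODE_COST_SKETCH;
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the KVA is the
binding limit: 75% of 1 GiB is 805306368 bytes, which allows 393216 vnodes.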


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
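
The attach/lookup/release/detach life cycle listed above can be sketched in
userland C. This is an illustration only, not the kernel code: the ldisc_*
names, the struct layout, the EEXIST/ENOENT returns, and the absence of any
locking are all assumptions made for the sketch. Only the "detach of an
in-use discipline yields EBUSY" rule comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry (not struct linesw). */
struct ldisc {
	const char *name;
	unsigned refcnt;	/* references held by successful lookups */
	struct ldisc *next;	/* singly linked registry list */
};

static struct ldisc *ldisc_list;

/* Register a discipline; EEXIST if the name is already registered. */
int
ldisc_attach(struct ldisc *d)
{
	struct ldisc *p;

	assert(d != NULL && d->name != NULL);
	for (p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	struct ldisc *p;

	for (p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
int
ldisc_detach(struct ldisc *d)
{
	struct ldisc **pp;

	if (d->refcnt != 0)
		return EBUSY;
	for (pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with one
ldisc_release(); a real implementation would also need to protect the list
and the counters against concurrent access.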


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
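
The "once control" pattern named here can be sketched as follows. This is a
simplified userland illustration, not the kernel's own interface: struct
once, run_once(), and the *_sketch functions are hypothetical names, and a
kernel version would additionally need locking so that concurrent callers
wait for the first initialization to finish.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once control (single-threaded sketch, no locking). */
struct once {
	int done;	/* nonzero after the init function has run */
	int error;	/* result of the init function, replayed to later callers */
};

#define ONCE_INITIALIZER { 0, 0 }

/* Run init at most once; every caller sees the first call's result. */
int
run_once(struct once *o, int (*init)(void))
{
	assert(o != NULL && init != NULL);
	if (!o->done) {
		o->error = init();
		o->done = 1;
	}
	return o->error;
}

static struct once lkm_once = ONCE_INITIALIZER;
static int lkm_init_calls;

/* Stand-in for the subsystem's one-time initialization. */
static int
lkm_init_sketch(void)
{
	lkm_init_calls++;
	return 0;
}

/* Stand-in for an open routine that initializes lazily on first use. */
int
lkm_open_sketch(void)
{
	return run_once(&lkm_once, lkm_init_sketch);
}
```

The design point is that the subsystem initializes itself on first use, so
main() no longer needs to know about it.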


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
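
A minimal userland sketch of this kind of expansion follows. It is not the
kernel implementation: magic_expand(), the table, and its fixed values are
hypothetical, and the rule that a magic string must be followed by '/' or
the end of the path is an assumption made so that "@machine" is not
mistaken for a prefix of "@machine_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical values; a kernel would take these from the running system. */
static const struct magic {
	const char *name;
	const char *value;
} magics[] = {
	{ "@machine_arch", "i386" },
	{ "@machine", "i386" },
	{ "@ostype", "NetBSD" },
};

/*
 * Copy src to dst, replacing each magic string that is terminated by '/'
 * or by the end of the path. Returns 0 on success, -1 if dst is too small.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen)
{
	size_t used = 0;

	assert(src != NULL && dst != NULL);
	while (*src != '\0') {
		const char *piece = src;	/* default: copy one character */
		size_t piecelen = 1, skip = 1, i;

		for (i = 0; *src == '@' &&
		    i < sizeof(magics) / sizeof(magics[0]); i++) {
			size_t n = strlen(magics[i].name);

			if (strncmp(src, magics[i].name, n) == 0 &&
			    (src[n] == '/' || src[n] == '\0')) {
				piece = magics[i].value;
				piecelen = strlen(piece);
				skip = n;
				break;
			}
		}
		if (used + piecelen >= dstlen)
			return -1;	/* no room for piece plus terminator */
		memcpy(dst + used, piece, piecelen);
		used += piecelen;
		src += skip;
	}
	if (used >= dstlen)
		return -1;
	dst[used] = '\0';
	return 0;
}
```

With the table above, the link target from the example usage,
"/arch/@machine_arch/bin", expands to "/arch/i386/bin", while an unknown
string such as "@machinery" is copied through unchanged.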


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.
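
The "run every finalizer until a whole pass does no work" loop can be
sketched like this. The names (finalize_register, finalize_run, drain_one),
the fixed-size table, and the return conventions are hypothetical and chosen
for illustration; they are not the kernel's interface.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FINALIZERS 8

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

static struct {
	finalizer_fn fn;
	void *arg;
} finalizers[MAX_FINALIZERS];
static size_t nfinalizers;

/* Add a finalizer to the table; -1 if the table is full. */
int
finalize_register(finalizer_fn fn, void *arg)
{
	assert(fn != NULL);
	if (nfinalizers == MAX_FINALIZERS)
		return -1;
	finalizers[nfinalizers].fn = fn;
	finalizers[nfinalizers].arg = arg;
	nfinalizers++;
	return 0;
}

/* Run passes until one full pass does no work; returns the pass count. */
unsigned
finalize_run(void)
{
	unsigned passes = 0;
	int worked;
	size_t i;

	do {
		worked = 0;
		for (i = 0; i < nfinalizers; i++)
			if (finalizers[i].fn(finalizers[i].arg) != 0)
				worked = 1;
		passes++;
	} while (worked);
	return passes;
}

/* Example finalizer: configures one pending unit per pass. */
int
drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Iterating to a fixed point matters because work done by one finalizer (for
example, configuring a RAID set) can make new devices appear that another
finalizer then has to handle.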

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
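
The mapping above can be sketched as a small lookup table. This is only
an illustration: the OLD_KERN_* enum names, their numeric values and the
helper old_kern_to_errno() are assumptions made for this sketch and do
not come from the tree. KERN_FAILURE and the unused codes are left out
because the log gives no single replacement for them.

```c
#include <assert.h>
#include <errno.h>

/*
 * Illustrative stand-ins for the retired codes. The names and the
 * numeric values are assumptions made for this sketch only.
 */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE,
	OLD_KERN_CODE_COUNT
};

/* Translate one retired code into the traditional errno value. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	static const int table[OLD_KERN_CODE_COUNT] = {
		[OLD_KERN_SUCCESS]            = 0,
		[OLD_KERN_INVALID_ADDRESS]    = EFAULT,
		[OLD_KERN_PROTECTION_FAILURE] = EACCES,
		[OLD_KERN_NO_SPACE]           = ENOMEM,
		[OLD_KERN_INVALID_ARGUMENT]   = EINVAL,
		[OLD_KERN_RESOURCE_SHORTAGE]  = ENOMEM,
	};

	assert((unsigned)code < OLD_KERN_CODE_COUNT);
	return table[code];
}
```

Note that two distinct old codes collapse into ENOMEM, so the
translation cannot be reversed.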


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
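
A rough sketch of the sizing rule described above, under stated
assumptions: the function name sketch_autosize_vnodes(), the explicit
per-vnode size parameter and the use of byte counts are all
hypothetical. The log only states the 0.5% budget and the NVNODE floor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many vnodes fit in 0.5% of memory, but never fewer than
 * min_vnodes (standing in for NVNODE). All names are illustrative.
 */
static uint64_t
sketch_autosize_vnodes(uint64_t mem_bytes, uint64_t vnode_bytes,
    uint64_t min_vnodes)
{
	uint64_t budget, count;

	assert(vnode_bytes > 0);
	budget = mem_bytes / 200;	/* 0.5% of system memory */
	count = budget / vnode_bytes;
	return count < min_vnodes ? min_vnodes : count;
}
```

With 1 GiB of memory and an assumed 256-byte vnode this yields 20971
vnodes; on a very small machine the floor takes over instead.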


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
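
One common way to get both properties listed above (caller-supplied
storage, constant-time insert and remove) is a hashed timing wheel. The
sketch below is an assumption about the general technique, not a copy
of the kernel code: every sketch_* name, the wheel size and the
callback are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_WHEEL_SIZE 8	/* illustrative; must be a power of two */

struct sketch_callout {			/* storage owned by the client */
	struct sketch_callout *next;	/* next entry in the same bucket */
	struct sketch_callout **prevp;	/* pointer that points at us */
	unsigned long expiry;		/* absolute tick to fire at */
	void (*func)(void *);
	void *arg;
};

struct sketch_wheel {
	struct sketch_callout *bucket[SKETCH_WHEEL_SIZE];
	unsigned long now;
};

/* Example callback: bump the int counter passed as arg. */
static void
sketch_count(void *arg)
{
	(*(int *)arg)++;
}

/* O(1): hash the expiry tick to a bucket and push at its head. */
static void
sketch_callout_insert(struct sketch_wheel *w, struct sketch_callout *c,
    unsigned long ticks, void (*func)(void *), void *arg)
{
	struct sketch_callout **head;

	assert(ticks > 0);
	c->expiry = w->now + ticks;
	c->func = func;
	c->arg = arg;
	head = &w->bucket[c->expiry & (SKETCH_WHEEL_SIZE - 1)];
	c->next = *head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = head;
	*head = c;
}

/* O(1): unlink through the back pointer; no list walk is needed. */
static void
sketch_callout_remove(struct sketch_callout *c)
{
	if (c->prevp == NULL)
		return;			/* not pending */
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* Advance one tick; fire entries due now and return how many fired. */
static int
sketch_wheel_tick(struct sketch_wheel *w)
{
	struct sketch_callout *c, *next;
	int fired = 0;

	w->now++;
	for (c = w->bucket[w->now & (SKETCH_WHEEL_SIZE - 1)]; c != NULL;
	    c = next) {
		next = c->next;
		if (c->expiry != w->now)
			continue;	/* due on a later lap of the wheel */
		sketch_callout_remove(c);
		c->func(c->arg);
		fired++;
	}
	return fired;
}
```

The sketch assumes callouts start zero-initialized and that a callback
does not insert or remove other callouts in the bucket being scanned;
a real implementation would also need locking.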


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
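
As a minimal illustration of the "low address and a size" convention
described above, machine-dependent code could derive the initial stack
pointer roughly as follows. The helper name and the grows_down flag are
hypothetical, and real ports also apply alignment and frame setup that
this sketch omits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the starting stack pointer for a stack area given as its lowest
 * address plus a size. Names are illustrative only.
 */
static uintptr_t
sketch_initial_sp(uintptr_t low_addr, size_t size, int grows_down)
{
	assert(size > 0);
	/* A downward-growing stack starts at the top of the area. */
	return grows_down ? low_addr + size : low_addr;
}
```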


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
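
The rule in the two points above can be sketched like this. The helper
sketch_init_rlimit() and its parameter names are assumptions for
illustration: the hard limit takes the system-wide maximum and the soft
limit is the compiled-in default clamped to that maximum.

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Build an initial limit: rlim_max is the system-wide cap (maxfiles or
 * maxproc in the log), rlim_cur is min(compiled default, cap).
 */
static struct rlimit
sketch_init_rlimit(rlim_t compiled_default, rlim_t system_max)
{
	struct rlimit lim;

	lim.rlim_max = system_max;
	lim.rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;
	assert(lim.rlim_cur <= lim.rlim_max);
	return lim;
}
```

A caller reading the limit back through getrlimit(2) then sees a finite
hard limit it can size its tables against, which is the motivation the
log entry gives.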


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
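
The point that is easy to get wrong in the conversions above is the
argument order: the b* routines take the source first, the mem*
routines take the destination first. The compat_* wrappers below are
hypothetical names used only to make the swap explicit.

```c
#include <assert.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len); regions must not overlap. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len); overlap is allowed. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len); argument order is unchanged. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len); the fill byte is inserted. */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```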


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.510 14-Dec-2019 ad

Include radixtree in the kernel.


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.
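
A userland sketch of the sibling-list idea described above, not the kernel code itself. The structure and function names (`cpu_topo`, `link_core_siblings`) are hypothetical, and it assumes each core has a system-wide unique `core_id`:

```c
#include <stddef.h>

/* Hypothetical per-CPU record; the real kernel structure differs. */
struct cpu_topo {
	int core_id;                    /* assumed unique across the system */
	struct cpu_topo *core_sibling;  /* next CPU in the same core (circular) */
};

/*
 * Link every CPU into a circular list with the other CPUs of its core.
 * A CPU without siblings ends up pointing at itself.
 */
static void
link_core_siblings(struct cpu_topo *cpus, size_t ncpu)
{
	/* Start with one-element rings. */
	for (size_t i = 0; i < ncpu; i++)
		cpus[i].core_sibling = &cpus[i];

	for (size_t i = 0; i < ncpu; i++) {
		for (size_t j = 0; j < i; j++) {
			if (cpus[j].core_id == cpus[i].core_id) {
				/* Splice CPU i into the ring that CPU j is on. */
				cpus[i].core_sibling = cpus[j].core_sibling;
				cpus[j].core_sibling = &cpus[i];
				break;
			}
		}
	}
}
```

Walking `core_sibling` from any CPU until the walk returns to the starting CPU visits every CPU of that core exactly once. A package-level ring could be built the same way with a package ID.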


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
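
The boottime adjustment described above amounts to subtracting the elapsed uptime from the time of day at the moment it is initialised. A minimal userland sketch, assuming both values are available as `struct timespec`; the helper names `ts_sub` and `compute_boottime` are hypothetical and not taken from the kernel:

```c
#include <time.h>

/* Return a - b with tv_nsec normalised into [0, 1e9). Assumes a >= b. */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * boottime = (time of day when it was initialised)
 *          - (interval since the clocks were started)
 */
static struct timespec
compute_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return ts_sub(tod_at_init, uptime_at_init);
}
```

With a 6.5 second gap between starting the clocks and initialising the time of day, the computed boottime lands 6.5 seconds before the initialisation time, which is the offset the commit removes from `dmesg -T` output.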


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
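
For illustration, a version 4 UUID is 16 random bytes with the version and variant fields overwritten as RFC 4122 specifies. The sketch below is not the kernel's uuidgen code; the function name `uuid_v4_stamp` is hypothetical, and it assumes the bytes are in the canonical string (big-endian) order rather than the layout of the kernel's `struct uuid`:

```c
#include <stdint.h>

/*
 * Turn 16 bytes of random data into a version 4 UUID by stamping
 * the RFC 4122 version and variant bits. The caller supplies the
 * randomness; no MAC address or timestamp is involved.
 */
static void
uuid_v4_stamp(uint8_t uuid[16])
{
	/* High nibble of byte 6 is the version: 4 = random. */
	uuid[6] = (uint8_t)((uuid[6] & 0x0f) | 0x40);
	/* Top two bits of byte 8 are the variant: binary 10. */
	uuid[8] = (uint8_t)((uuid[8] & 0x3f) | 0x80);
}
```

The remaining 122 bits stay random, so no global generator state has to be shared or locked between callers.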


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
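
The continuous-output test mentioned above can be pictured as comparing each block from a generator against the previous one and rejecting exact repeats. This is a sketch of the general technique only, not the code in libkern/rngtest.c; the names `cont_test` and `cont_test_check` and the 16-byte block size are assumptions made for illustration:

```c
#include <stdbool.h>
#include <string.h>

#define BLOCK_LEN 16	/* assumed block size, for illustration only */

struct cont_test {
	unsigned char prev[BLOCK_LEN];	/* last block seen */
	bool have_prev;			/* false until the first block arrives */
};

/*
 * Return true if the block is acceptable, false if it is identical to
 * the previous block (a sign of a stuck source). The block is always
 * remembered for the next comparison.
 */
static bool
cont_test_check(struct cont_test *ct, const unsigned char block[BLOCK_LEN])
{
	bool ok = !ct->have_prev ||
	    memcmp(ct->prev, block, BLOCK_LEN) != 0;

	memcpy(ct->prev, block, BLOCK_LEN);
	ct->have_prev = true;
	return ok;
}
```

A caller would zero-initialise a `struct cont_test` per source and treat a `false` return as grounds to stop trusting that source, in the spirit of detaching a bad hardware RNG while the system keeps running.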


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
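
The force-flag rule from item 2 can be sketched as a small predicate. This is an illustration under stated assumptions rather than the module subsystem's code: `MODFLG_MUST_FORCE` and `MODCTL_LOAD_FORCE` are the names used in the commit message, but the numeric values, the `module_sketch` structure, and both function names here are hypothetical:

```c
#include <stdbool.h>

#define MODFLG_MUST_FORCE 0x01	/* value is illustrative */
#define MODCTL_LOAD_FORCE 0x01	/* value is illustrative */

/* Hypothetical stand-in for the relevant part of struct module. */
struct module_sketch {
	int mod_flags;
};

/* Mark a built-in module as requiring a forced load; never cleared. */
static void
require_force_sketch(struct module_sketch *mod)
{
	mod->mod_flags |= MODFLG_MUST_FORCE;
}

/* A flagged built-in may only be re-loaded when the caller forces it. */
static bool
builtin_load_allowed(const struct module_sketch *mod, int load_flags)
{
	if ((mod->mod_flags & MODFLG_MUST_FORCE) == 0)
		return true;
	return (load_flags & MODCTL_LOAD_FORCE) != 0;
}
```

Keeping the flag sticky means an explicitly unloaded built-in cannot be brought back by an ordinary autoload.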


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
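
The deferral idea can be sketched as a small callback queue that is drained once the root file system is available. This is an illustration under stated assumptions, not the config_mountroot(9) implementation: defer_until_mountroot(), run_mountroot_hooks(), the fixed-size array and the choice to run callbacks immediately once root is mounted are all hypothetical.

```c
#include <stddef.h>

#define MAX_DEFERRED 8	/* arbitrary capacity for the sketch */

typedef void (*deferred_fn)(void *);

struct deferred {
	deferred_fn fn;
	void *arg;
};

static struct deferred mountroot_queue[MAX_DEFERRED];
static size_t mountroot_count;
static int root_mounted;

/*
 * Queue a callback to run after the root file system is mounted.
 * Returns 0 on success, -1 if the queue is full.  If root is already
 * mounted the callback runs right away (a choice made for this sketch).
 */
static int
defer_until_mountroot(deferred_fn fn, void *arg)
{
	if (root_mounted) {
		fn(arg);
		return 0;
	}
	if (mountroot_count == MAX_DEFERRED)
		return -1;
	mountroot_queue[mountroot_count].fn = fn;
	mountroot_queue[mountroot_count].arg = arg;
	mountroot_count++;
	return 0;
}

/* Called once after mountroot: run every queued callback in order. */
static void
run_mountroot_hooks(void)
{
	size_t i;

	root_mounted = 1;
	for (i = 0; i < mountroot_count; i++)
		mountroot_queue[i].fn(mountroot_queue[i].arg);
	mountroot_count = 0;
}

/* Demo callback standing in for "load firmware, then finish attach". */
static int demo_fw_loaded;

static void
demo_load_firmware(void *arg)
{
	*(int *)arg += 1;
}
```

A driver would call defer_until_mountroot() from its attach routine and complete firmware-dependent setup inside the callback.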


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
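
The "always present op vector" pattern can be sketched as follows. The names here (tap_ops, tap_stub, tap_real, driver_input) are hypothetical and only illustrate the technique; they are not the bpf interfaces from this commit.

```c
#include <stddef.h>

/* Hypothetical op vector; clients only ever call through this table. */
struct tap_ops {
	void (*tap_attach)(int ifindex);
	void (*tap_packet)(int ifindex, const void *pkt, size_t len);
};

/* Stub implementation: always linked in, does nothing. */
static void
stub_attach(int ifindex)
{
	(void)ifindex;
}

static void
stub_packet(int ifindex, const void *pkt, size_t len)
{
	(void)ifindex; (void)pkt; (void)len;
}

static const struct tap_ops tap_stub = {
	.tap_attach = stub_attach,
	.tap_packet = stub_packet,
};

/* The vector starts out pointing at the stub, so it is never NULL. */
static const struct tap_ops *tap_ops = &tap_stub;

/* "Real" implementation, installed when the subsystem attaches. */
static size_t real_packets_seen;

static void
real_attach(int ifindex)
{
	(void)ifindex;
}

static void
real_packet(int ifindex, const void *pkt, size_t len)
{
	(void)ifindex; (void)pkt; (void)len;
	real_packets_seen++;
}

static const struct tap_ops tap_real = {
	.tap_attach = real_attach,
	.tap_packet = real_packet,
};

static void
tap_subsystem_attach(void)
{
	tap_ops = &tap_real;
}

/* Client code: no #if around the call, it works with either table. */
static void
driver_input(int ifindex, const void *pkt, size_t len)
{
	tap_ops->tap_packet(ifindex, pkt, len);
}
```

Because the stub is always present, the client compiles and runs the same way whether or not the real implementation is configured, which is what removes the need for the conditional in callers.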


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
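
The handling of input at the init path prompt can be sketched as a small classifier. The special strings "halt", "reboot" and "ddb" come from the message above; the enum, the function name and the rule that a valid path must start with '/' are assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical result codes for input typed at the init path prompt. */
enum init_input {
	INIT_PATH,	/* looks like a path to an init program */
	INIT_HALT,
	INIT_REBOOT,
	INIT_DDB,
	INIT_INVALID	/* neither a command nor a path: print usage */
};

static enum init_input
classify_init_input(const char *s)
{
	if (strcmp(s, "halt") == 0)
		return INIT_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_DDB;
	/* Assumption: only absolute paths are accepted as init paths. */
	if (s[0] == '/')
		return INIT_PATH;
	return INIT_INVALID;
}
```

An INIT_INVALID result corresponds to the case where a usage line would be printed and the prompt repeated.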


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
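
The semantics described above (lockless fast path, the routine runs at most once, an error is remembered and never retried) can be sketched in userland C. Note the substitution: this sketch uses only C11 atomics and a spin-wait for the slow path, whereas the commit describes global locks plus atomic ops. All names (struct once, run_once, the ONCE_* states) are hypothetical.

```c
#include <stdatomic.h>

enum { ONCE_VIRGIN = 0, ONCE_RUNNING, ONCE_DONE };

struct once {
	atomic_int status;	/* one of the ONCE_* states */
	int error;		/* result of fn(), set before ONCE_DONE */
};

/*
 * Run fn() at most once for this control block and return its result.
 * Later callers get the stored result, including a stored error:
 * the routine is not retried after a failure.
 */
static int
run_once(struct once *o, int (*fn)(void))
{
	int expected = ONCE_VIRGIN;

	/* Fast path: already completed, a single atomic load suffices. */
	if (atomic_load_explicit(&o->status, memory_order_acquire) ==
	    ONCE_DONE)
		return o->error;

	/* Slow path: exactly one caller wins the right to run fn(). */
	if (atomic_compare_exchange_strong(&o->status, &expected,
	    ONCE_RUNNING)) {
		o->error = fn();
		atomic_store_explicit(&o->status, ONCE_DONE,
		    memory_order_release);
		return o->error;
	}

	/* Lost the race: wait for the winner to publish the result. */
	while (atomic_load_explicit(&o->status, memory_order_acquire) !=
	    ONCE_DONE)
		continue;
	return o->error;
}

/* Demo: an init routine that fails, to show the no-retry behaviour. */
static int demo_calls;
static struct once demo_once;	/* zero-initialized: ONCE_VIRGIN */

static int
demo_failing_init(void)
{
	demo_calls++;
	return 5;
}
```

Calling run_once(&demo_once, demo_failing_init) twice returns the same error both times while demo_failing_init() itself runs only once.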


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
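
The arithmetic behind this limit can be sketched as below. The 2kB per-vnode cost and the 75% thresholds come from the message above; the function names and the exact rounding are assumptions made for illustration, not the code from this commit.

```c
#include <stdbool.h>
#include <stdint.h>

/* Worst-case cost of one vnode and its associated structures. */
#define VNODE_COST_BYTES 2048u

/*
 * Largest vnode count whose worst-case footprint stays within 75% of
 * kernel virtual address space and within 75% of physical memory.
 */
static uint64_t
desiredvnodes_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so the 75% step cannot overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}

/* Hypothetical validation hook for a new desiredvnodes value. */
static bool
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= desiredvnodes_limit(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 4 GiB of physical memory the tighter bound is the KVA one: 0.75 * 2^30 / 2048 = 393216 vnodes.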


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot
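
A minimal sketch of the semantics listed above, under one reading of the
commit: boottime is set once at boot and afterwards only shifted by the delta
of each set-time operation, so the reported uptime does not jump when the
clock is stepped. The names clock_state, clock_settime_sketch() and
clock_uptime() are hypothetical and are not taken from the kernel source.

```c
#include <stdint.h>

/* Hypothetical clock state, in whole seconds for simplicity. */
struct clock_state {
	int64_t now;       /* current wall-clock time */
	int64_t boottime;  /* wall-clock time at which the system booted */
};

/* Step the wall clock to new_now, shifting boottime by the same delta. */
void
clock_settime_sketch(struct clock_state *cs, int64_t new_now)
{
	int64_t delta = new_now - cs->now;

	cs->now = new_now;
	cs->boottime += delta;
}

/* Seconds since boot, as a tool like uptime(1) would report them. */
int64_t
clock_uptime(const struct clock_state *cs)
{
	return cs->now - cs->boottime;
}
```

Because both values move by the same delta, now - boottime is unchanged by a
set-time operation in this sketch.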


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
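
The reference-counting rule in the list above (a lookup holds a reference, a
release drops it, and detaching an in-use discipline fails with EBUSY) can be
sketched as follows. This is an illustration only: the commit describes a
linked list, while the sketch uses a small fixed table, and every name in it
(struct ldisc, ldisc_attach() and so on) is hypothetical rather than the
kernel's ttyldisc_*() interface.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_LDISC 8	/* arbitrary table size for this sketch */

struct ldisc {
	const char *name;
	int refcnt;	/* references held by successful lookups */
	int attached;	/* nonzero while registered */
};

static struct ldisc ldisc_table[MAX_LDISC];

/* Find an attached discipline by name without taking a reference. */
static struct ldisc *
ldisc_find(const char *name)
{
	for (size_t i = 0; i < MAX_LDISC; i++) {
		if (ldisc_table[i].attached &&
		    strcmp(ldisc_table[i].name, name) == 0)
			return &ldisc_table[i];
	}
	return NULL;
}

/* Register a discipline; the provider calls this itself. */
int
ldisc_attach(const char *name)
{
	if (ldisc_find(name) != NULL)
		return EEXIST;
	for (size_t i = 0; i < MAX_LDISC; i++) {
		if (!ldisc_table[i].attached) {
			ldisc_table[i].name = name;
			ldisc_table[i].refcnt = 0;
			ldisc_table[i].attached = 1;
			return 0;
		}
	}
	return ENOMEM;
}

/* Look up a discipline and hold a reference to it. */
struct ldisc *
ldisc_lookup(const char *name)
{
	struct ldisc *d = ldisc_find(name);

	if (d != NULL)
		d->refcnt++;
	return d;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
	d->refcnt--;
}

/* Unregister a discipline; refuse while references are held. */
int
ldisc_detach(const char *name)
{
	struct ldisc *d = ldisc_find(name);

	if (d == NULL)
		return ENOENT;
	if (d->refcnt > 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

The sketch has no locking; a real implementation would have to protect the
table and the counters against concurrent access.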


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is no longer
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
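
The "once control" pattern used in this and the neighbouring commits can be
illustrated with a small single-threaded sketch: the first caller runs the
initializer, later callers only see its recorded result. The names once_ctl,
run_once_sketch(), subsystem_init() and subsystem_open() are hypothetical,
and the sketch is not the kernel's implementation, which would also have to
cope with concurrent callers.

```c
/* Hypothetical once control: records whether init ran and what it returned. */
struct once_ctl {
	int done;	/* nonzero after the initializer has run */
	int status;	/* return value of the initializer */
};

/* Run init() the first time only; always return its recorded status. */
int
run_once_sketch(struct once_ctl *oc, int (*init)(void))
{
	if (!oc->done) {
		oc->status = init();
		oc->done = 1;
	}
	return oc->status;
}

/* Counts how often the initializer really ran (for demonstration). */
static int subsystem_init_calls;

/* Stand-in for a subsystem initializer that used to be called from main(). */
int
subsystem_init(void)
{
	subsystem_init_calls++;
	return 0;
}

static struct once_ctl subsystem_once;

/* Entry point such as a device open: initializes lazily on first use. */
int
subsystem_open(void)
{
	return run_once_sketch(&subsystem_once, subsystem_init);
}
```

With this shape the explicit init call at boot becomes unnecessary, because
every entry point goes through the once control first.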


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
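
A rough sketch of how such an expansion could work on a symlink target. This
is an illustration, not the kernel's code: struct magic_var, expand_magic()
and the table passed in are hypothetical, and the longest-match rule is an
assumption chosen so that @machine_arch is not read as @machine followed by
"_arch".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; a kernel would take values from system state. */
struct magic_var {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "i386" */
};

/*
 * Expand magic strings in link into out (capacity outlen bytes).
 * Returns 0 on success, -1 if the result does not fit.
 * When several names match at one position, the longest one wins.
 */
int
expand_magic(const char *link, char *out, size_t outlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t o = 0;

	if (outlen == 0)
		return -1;

	while (*link != '\0') {
		const struct magic_var *best = NULL;
		size_t bestlen = 0;

		if (*link == '@') {
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (n > bestlen &&
				    strncmp(link, vars[i].name, n) == 0) {
					best = &vars[i];
					bestlen = n;
				}
			}
		}

		if (best != NULL) {
			/* Copy the value in place of the magic string. */
			size_t vlen = strlen(best->value);

			if (o + vlen >= outlen)
				return -1;
			memcpy(out + o, best->value, vlen);
			o += vlen;
			link += bestlen;
		} else {
			/* Ordinary character: copy it through unchanged. */
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *link++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

Applied to the example above with @machine_arch mapped to "i386", the target
/arch/@machine_arch/bin would come out as /arch/i386/bin.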


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
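
The sizing rule can be made concrete with a small arithmetic sketch: take a
fraction of physical memory and divide by the size of one vnode. The function
name vnode_cache_entries() and the 256-byte vnode size used in the note below
are assumptions for illustration, not values taken from the kernel.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit into physmem_bytes / divisor bytes.
 * divisor 100 corresponds to 1% of physical memory, 200 to 0.5%.
 */
uint64_t
vnode_cache_entries(uint64_t physmem_bytes, uint64_t divisor,
    uint64_t vnode_size)
{
	if (divisor == 0 || vnode_size == 0)
		return 0;
	return physmem_bytes / divisor / vnode_size;
}
```

For a machine with 512 MB of memory and an assumed 256-byte vnode, this gives
20971 entries at 1% against 10485 at 0.5%. A larger struct vnode, as on the
64-bit ports mentioned above, yields fewer entries for the same percentage.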


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
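
The "iterate until nobody reports work" behaviour described above can be
sketched in plain C. This is a minimal illustration only: the names
finalizer_fn, struct finalizer, run_finalizers() and countdown_finalizer()
are hypothetical and are not the kernel's actual interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical finalizer type: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made (always at least one).
 */
static int
run_finalizers(struct finalizer *list, size_t count)
{
	int passes = 0;
	int did_work;

	assert(list != NULL || count == 0);
	do {
		did_work = 0;
		for (size_t i = 0; i < count; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: does one unit of work per call until a budget runs out. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining > 0) {
		(*remaining)--;
		return 1;
	}
	return 0;
}
```

The final pass exists only to confirm that nothing is left to do, so a
finalizer with two units of work causes three passes in this sketch.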


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pagable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
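
The fixed part of the mapping above can be written as a small lookup. This is
a sketch, not kernel code: the OLD_KERN_* enumerators and old_kern_to_errno()
are hypothetical stand-ins, their numeric values are arbitrary, and
KERN_FAILURE is left out because the log says it has no single replacement.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate a retired code to the errno value listed in the table above. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	/* Not reached for valid enumerators; fall back defensively. */
	assert(!"unmapped code");
	return EINVAL;
}
```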


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
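
A rough sketch of the sizing rule described above, assuming the vnode count is
derived by dividing 0.5% of memory by a per-vnode size. The function name
autosize_vnodes() and its parameters are hypothetical, the per-vnode size is
an assumption supplied by the caller, and the "only if not set in the kernel
config" check is omitted.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the number of vnodes that fit in 0.5% of memory, but never
 * fewer than floor_vnodes (standing in for the NVNODE minimum).
 */
static uint64_t
autosize_vnodes(uint64_t mem_bytes, uint64_t vnode_bytes, uint64_t floor_vnodes)
{
	uint64_t budget, count;

	assert(vnode_bytes > 0);
	budget = mem_bytes / 200;	/* 0.5% of memory */
	count = budget / vnode_bytes;
	return count > floor_vnodes ? count : floor_vnodes;
}
```

With 1 GiB of memory, an assumed 256-byte vnode and a floor of 1000, this
yields 20971 vnodes; with 1 MiB the floor of 1000 applies instead.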


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
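
The rule in the two bullets above can be sketched as follows. This is an
illustration under assumptions: initial_limit() is a hypothetical helper, not
the kernel's code, and its two arguments stand in for the compile-time default
(NOFILE or MAXUPRC) and the system-wide cap (maxfiles or maxproc).

```c
#include <assert.h>
#include <sys/resource.h>

/*
 * Build an initial limit: the hard limit is the system-wide cap rather
 * than RLIM_INFINITY, and the soft limit is the smaller of the
 * compile-time default and that cap.
 */
static struct rlimit
initial_limit(rlim_t compiled_default, rlim_t system_max)
{
	struct rlimit rl;

	assert(system_max != RLIM_INFINITY);
	rl.rlim_max = system_max;
	rl.rlim_cur = compiled_default < system_max ?
	    compiled_default : system_max;
	return rl;
}
```

The soft limit therefore never exceeds the hard limit, even when the
compile-time default is larger than the system-wide cap.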


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
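
The argument-order change in the mapping above is easy to get wrong, so here
is a small sketch of it. The compat_* wrappers are hypothetical names used
only for illustration; they simply apply the four substitutions listed in the
commit message.

```c
#include <assert.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len): source and destination swap. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): safe for overlapping buffers. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len): argument order is unchanged. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len): the fill byte becomes explicit. */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```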


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for a process to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.509 12-Dec-2019 pgoyette

Eliminate per-hook duplication of common code as suggested by
(and with major contributions from) riastradh@

Welcome to 9.99.23


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version already committed only reports that a leak has
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
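
The subtraction described above can be sketched in userspace C as follows. This is only an illustration under stated assumptions: sketch_timespec_sub() and sketch_boottime() are hypothetical names, not the functions the kernel change actually uses, and both inputs are assumed to be normalized timespecs.

```c
#include <assert.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* Return a - b for normalized timespecs (0 <= tv_nsec < 1e9). */
static struct timespec
sketch_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < SKETCH_NSEC_PER_SEC);
	assert(b.tv_nsec >= 0 && b.tv_nsec < SKETCH_NSEC_PER_SEC);
	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second to keep tv_nsec non-negative. */
		r.tv_sec--;
		r.tv_nsec += SKETCH_NSEC_PER_SEC;
	}
	return r;
}

/*
 * Boot time = time of day at the moment it was initialised, minus the
 * interval that had already elapsed since the clocks were started.
 */
static struct timespec
sketch_boottime(struct timespec tod_at_init, struct timespec elapsed_at_init)
{
	return sketch_timespec_sub(tod_at_init, elapsed_at_init);
}
```

For instance, if the time of day is set 6.5 seconds after the clocks start, the computed boot time is 6.5 seconds earlier than the moment of initialisation, which is the offset the commit message describes.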

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
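
The size mismatch described above can be sketched as follows. The types here are hypothetical stand-ins, not the kernel's real syscall-argument definitions: sketch_slot models an argument slot padded to 64 bits, as on an ABI where syscall arguments are register-sized while pointers may be narrower.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical argument slot: always at least 64 bits wide, regardless of
 * the pointer size, mimicking register-sized syscall arguments.
 */
union sketch_slot {
	const char *ptr;
	int64_t pad;
};

/* Hypothetical three-argument syscall argument block. */
struct sketch_exec_args {
	union sketch_slot path;
	union sketch_slot argp;
	union sketch_slot envp;
};

/* Bytes occupied by an argv[] array with the given number of entries. */
static size_t
sketch_argv_bytes(size_t nentries)
{
	assert(nentries <= SIZE_MAX / sizeof(char *));
	return nentries * sizeof(char *);
}
```

With 32-bit pointers, sketch_argv_bytes(3) is 12 while sizeof(struct sketch_exec_args) is 24, so copying out sizeof(args) bytes from a three-entry argv[] would read past the end of the array. With 64-bit pointers the two sizes happen to coincide, which is one way such a bug can go unnoticed.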


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
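
As a rough illustration of how a NULL-terminated fallback list like this is typically consumed, the sketch below walks the paths in order and returns the first one a caller-supplied predicate accepts. The list contents come from the commit message; sketch_pick_init() and the two predicates are hypothetical, and the assumption that candidates are tried strictly in list order is not something this log entry itself states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Candidate list as given in the commit message, NULL-terminated. */
static const char *const sketch_initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Return the first path accepted by try_exec, or NULL if none is.
 * try_exec is a hypothetical stand-in for "attempt to exec this path".
 */
static const char *
sketch_pick_init(int (*try_exec)(const char *))
{
	assert(try_exec != NULL);
	for (size_t i = 0; sketch_initpaths[i] != NULL; i++) {
		if (try_exec(sketch_initpaths[i]))
			return sketch_initpaths[i];
	}
	return NULL;
}

/* Example predicate: pretend only the rescue copy is usable. */
static int
sketch_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}

/* Example predicate: pretend no candidate is usable. */
static int
sketch_none(const char *path)
{
	(void)path;
	return 0;
}
```

Under this sketch, a system whose /sbin copies are all unusable would still end up with "/rescue/init", which matches the apparent motivation for appending it to the end of the list.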


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
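
The MODFLG_MUST_FORCE rule in item 2 can be sketched in standalone C.
This is an illustration only: the flag names come from the commit
message, but the numeric values, the struct, and the *_sketch function
names are hypothetical and are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODFLG_MUST_FORCE	0x01u	/* name from the log; value assumed */
#define MODCTL_LOAD_FORCE	0x01u	/* name from the log; value assumed */

/* Hypothetical stand-in for the relevant part of 'struct module'. */
struct module_sketch {
	uint32_t mod_flags;
};

/* Mark a module so that later re-loads need an explicit force flag. */
static void
module_require_force_sketch(struct module_sketch *m)
{
	m->mod_flags |= MODFLG_MUST_FORCE;	/* once set, never cleared */
}

/* Decide whether a load request with the given flags may proceed. */
static bool
module_may_load_sketch(const struct module_sketch *m, uint32_t load_flags)
{
	if ((m->mod_flags & MODFLG_MUST_FORCE) != 0 &&
	    (load_flags & MODCTL_LOAD_FORCE) == 0)
		return false;
	return true;
}
```

Keeping the flag sticky mirrors the "once set, it should never be
unset" wording above.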

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.
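
The prompt handling described in the list above could look roughly like
the following standalone C sketch. The enum, the function name, and the
rule that a path must start with '/' are assumptions made for
illustration; only the three special strings come from the commit
message.

```c
#include <string.h>

/* Hypothetical classification of what the user typed at the prompt. */
enum init_action {
	INIT_HALT,
	INIT_REBOOT,
	INIT_DDB,
	INIT_PATH,
	INIT_INVALID
};

static enum init_action
classify_init_input(const char *s)
{
	/* Special strings named in the commit message. */
	if (strcmp(s, "halt") == 0)
		return INIT_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_DDB;
	/* Assumption: anything else must be an absolute path. */
	if (s[0] == '/')
		return INIT_PATH;
	return INIT_INVALID;
}
```

Under these assumptions, classify_init_input("/sbin/init") yields
INIT_PATH, and an unrecognised word yields INIT_INVALID so that the
caller can ask again.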

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
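
The semantics described above (a lockless fast path, plus no retry after
an error) can be sketched with C11 atomics. This is not the kernel's
once code: the names are hypothetical, and the slow path here spins
rather than sleeping on a global lock as the commit message describes.

```c
#include <stdatomic.h>

enum { ONCE_IDLE, ONCE_RUNNING, ONCE_DONE };

/* Hypothetical once-control block; zero-initialise before first use. */
struct once_sketch {
	atomic_int	state;
	int		error;	/* result of the one and only fn() call */
};

/* Hypothetical init routine that fails, and counts how often it runs. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	int expected = ONCE_IDLE;

	/* Fast path: already completed, so a single load is enough. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	/* Exactly one caller wins the IDLE -> RUNNING transition. */
	if (atomic_compare_exchange_strong(&o->state, &expected,
	    ONCE_RUNNING)) {
		o->error = fn();
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
		return o->error;
	}

	/* Losers wait for the winner; the kernel would sleep instead. */
	while (atomic_load_explicit(&o->state, memory_order_acquire) !=
	    ONCE_DONE)
		continue;
	return o->error;
}
```

Because the state moves to ONCE_DONE even when fn() fails, later callers
get the stored error back and fn() is not run again.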


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
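
The cap described above is simple arithmetic and can be sketched in
standalone C. The function name and the integer rounding are
assumptions; the 2kB per-vnode cost and the 75% limits are taken from
the commit message.

```c
#include <stdint.h>

#define VNODE_COST_BYTES 2048u	/* worst-case cost per vnode, per the log */

/*
 * Hypothetical helper: the largest vnode count that keeps the assumed
 * cost within 75% of both kernel virtual address space and physical
 * memory.
 */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so large sizes cannot overflow. */
	return (smaller / 4 * 3) / VNODE_COST_BYTES;
}
```

With 1 GiB of KVA and 4 GiB of RAM, the KVA is the tighter bound:
0.75 * 1 GiB / 2 KiB gives 393216 vnodes under these assumptions.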


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
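
The weak-symbol fallback described above can be sketched in user space.
This is only an illustration under stated assumptions: it uses the
GCC/Clang `weak` function attribute rather than whatever macro the kernel
uses (the log does not show it), and `boot_security()` and the `nop_calls`
counter are hypothetical names added so the effect can be observed.

```c
#include <assert.h>

/* Counts how often the fallback ran; exists only for this illustration. */
static int nop_calls;

/*
 * Weak default definition. If another object file supplies a strong
 * secmodel_start(), the linker picks that one instead of this stub.
 */
__attribute__((weak)) void
secmodel_start(void)
{
	nop_calls++;	/* a real no-op routine would do nothing at all */
}

/* Hypothetical boot step: callable whether or not a secmodel is linked. */
int
boot_security(void)
{
	secmodel_start();
	return nop_calls;
}
```

With no strong definition linked in, each call to boot_security() ends up
in the stub, so the caller needs no #ifdef around the call.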


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
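
The attach/lookup/release/detach life cycle in the list above can be
sketched as a small user-space registry. This is a simplified
illustration, not the kernel code: the `ldisc_*` names and the
`struct ldisc` layout are hypothetical, there is no locking, and the
EEXIST and ENOENT error choices are assumptions. Only the EBUSY result
for detaching an in-use discipline comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified line-discipline record. */
struct ldisc {
	const char   *name;
	unsigned      refcnt;	/* outstanding lookups */
	struct ldisc *next;	/* singly linked registry */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails if the name is already present. */
int
ldisc_attach(struct ldisc *ld)
{
	assert(ld != NULL && ld->name != NULL);
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference taken by ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	assert(ld != NULL && ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
int
ldisc_detach(struct ldisc *ld)
{
	assert(ld != NULL);
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

A provider would call ldisc_attach() once at load time; a consumer pairs
every ldisc_lookup() with an ldisc_release(), and a detach attempted in
between is refused.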


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
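
The "once control" pattern used here can be sketched as follows. This is
a single-threaded stand-in with hypothetical names (`struct once_ctl`,
`once_run()`, `lkm_init_sketch()`, `lkm_open_sketch()`); it is not the
kernel's own once interface, and a real implementation would need
locking so that concurrent first callers do not both run the
initializer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical once-control; a stand-in, not the kernel's own type. */
struct once_ctl {
	bool done;	/* has the initializer already run? */
	int  error;	/* cached result of the initializer */
};

/*
 * Run init() the first time through and cache its result; later calls
 * return the cached value without calling init() again.
 * Single-threaded sketch: a real implementation needs locking.
 */
int
once_run(struct once_ctl *ctl, int (*init)(void))
{
	assert(ctl != NULL && init != NULL);
	if (!ctl->done) {
		ctl->error = init();
		ctl->done = true;
	}
	return ctl->error;
}

/* Illustrative subsystem that initializes lazily on first open. */
static int lkm_init_calls;
static struct once_ctl lkm_once;

static int
lkm_init_sketch(void)
{
	lkm_init_calls++;
	return 0;
}

int
lkm_open_sketch(void)
{
	return once_run(&lkm_once, lkm_init_sketch);
}
```

Because initialization happens at the first open, main() no longer needs
an explicit init call for the subsystem, which is the point of the
change.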


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
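
How such a substitution might work can be sketched in user-space C. This
is not the kernel implementation: `struct magic`, `magic_expand()`, and
the rule that a name only matches when followed by '/' or the end of the
string are assumptions made for the illustration, and the replacement
values come from a caller-supplied table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One magic name and its replacement; values are supplied by the caller. */
struct magic {
	const char *name;	/* without the leading '@' */
	const char *value;
};

/*
 * Expand "@name" components of src into dst (dstlen bytes, NUL included).
 * A name only matches when it is followed by '/' or the end of the
 * string, so "@machine" does not swallow the front of "@machine_arch".
 * Returns 0 on success, -1 if dst is too small.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen,
    const struct magic *tab, size_t ntab)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);
				if (strncmp(src + 1, tab[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t rl = strlen(rep);
			if (out + rl >= dstlen)
				return -1;
			memcpy(dst + out, rep, rl);
			out += rl;
			src += skip;
		} else {
			/* Ordinary character, or an '@' with no known name. */
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

With a table mapping machine_arch to a hypothetical value such as
"x86_64", the link target /arch/@machine_arch/bin from the example above
would expand to /arch/x86_64/bin, while an unknown name is copied through
unchanged.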


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be set up
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
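
The marker-skipping iteration can be sketched like this. The sketch
covers only the "skip marker entries" half of the change; it does not
model how proclist_foreach_call inserts or removes its markers, and
`struct pentry`, `PENTRY_FOREACH`, and `pentry_pid_sum()` are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list entry: either a real process or a position marker. */
struct pentry {
	bool           is_marker;
	int            pid;
	struct pentry *next;
};

/* Iterate like a plain list walk, but step over marker entries. */
#define PENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker) continue; else

/* Sum the pids of all real entries; markers contribute nothing. */
int
pentry_pid_sum(struct pentry *head)
{
	struct pentry *p;
	int sum = 0;

	PENTRY_FOREACH(p, head)
		sum += p->pid;
	return sum;
}
```

Code that walks the list with the macro never sees a marker, so existing
loops stay correct while another walker has a marker parked in the list.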


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
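
The sizing arithmetic can be illustrated with a small helper. The log
gives only the ratio (1% of physical memory), not the exact kernel
formula, so `vnode_cache_entries()` and the vnode sizes used with it are
assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in `percent` percent of physical memory.
 * Both the percentage and the vnode size are parameters because the
 * log states only the ratio, not the formula the kernel uses.
 */
uint64_t
vnode_cache_entries(uint64_t physmem_bytes, unsigned percent,
    uint64_t vnode_size)
{
	assert(vnode_size > 0 && percent <= 100);
	return physmem_bytes / 100 * percent / vnode_size;
}
```

For 1 GiB of memory and a hypothetical 256-byte vnode this yields 41943
entries at 1%. Doubling the vnode size to 512 bytes halves that to 20971,
which is the effect behind the remark about 64-bit ports.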


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework;
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.
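
The "invoke until nobody reports work" rule described above can be sketched
roughly as follows. This is only an illustration of the iteration pattern,
not the kernel's code; the names run_finalizers, struct finalizer and
countdown_finalizer are hypothetical.

```c
#include <stddef.h>

/* Hypothetical finalizer signature: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports work.  Returns the number of passes made.
 */
int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: "works" until its counter reaches zero. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

The extra pass at the end is what lets one finalizer react to devices that
another finalizer brought into existence during the previous pass.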

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS -> 0
KERN_INVALID_ADDRESS -> EFAULT
KERN_PROTECTION_FAILURE -> EACCES
KERN_NO_SPACE -> ENOMEM
KERN_INVALID_ARGUMENT -> EINVAL
KERN_FAILURE -> various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE -> ENOMEM
KERN_NOT_RECEIVER -> <unused>
KERN_NO_ACCESS -> <unused>
KERN_PAGES_LOCKED -> <unused>
KERN_PAGES_LOCKED <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
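
A minimal sketch of the kind of overflow this avoids, assuming a 64-bit
unsigned type stands in for rlim_t and that the byte count is computed
from a page count and a page size. The function names are hypothetical.

```c
#include <stdint.h>

/* Narrow arithmetic: the product wraps modulo 2^32. */
uint32_t
sketch_bytes_narrow(uint32_t pages, uint32_t page_size)
{
	return pages * page_size;
}

/* Wide arithmetic: widen one operand first so the product fits. */
uint64_t
sketch_bytes_wide(uint32_t pages, uint32_t page_size)
{
	return (uint64_t)pages * page_size;
}
```

With 1048576 pages of 4096 bytes (4 GiB), the narrow version wraps to 0
while the wide version yields 4294967296.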


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
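
For illustration only, rounding an address up to a power-of-two alignment
is commonly done as below. This is a generic sketch, not the code inside
uvm_map(); the name sketch_align_up is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be 2^k). */
uintptr_t
sketch_align_up(uintptr_t addr, uintptr_t align)
{
	/* A power of two has exactly one bit set. */
	assert(align != 0 && (align & (align - 1)) == 0);

	return (addr + align - 1) & ~(align - 1);
}
```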


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
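
The sizing rule can be sketched as below. The per-vnode size, the minimum
count and the function name are hypothetical parameters chosen for
illustration; the log does not state the actual structure size.

```c
#include <stdint.h>

/*
 * Derive a vnode count from a memory budget expressed in tenths of a
 * percent of physical memory (5 = 0.5%), never going below min_vnodes.
 * vnode_size must be nonzero.
 */
uint64_t
sketch_desired_vnodes(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t min_vnodes, unsigned per_mille)
{
	uint64_t budget = physmem_bytes / 1000 * per_mille;
	uint64_t count = budget / vnode_size;

	return count > min_vnodes ? count : min_vnodes;
}
```

With 1000000000 bytes of memory, an assumed 200-byte vnode and 5 per
mille, the budget is 5000000 bytes, giving 25000 vnodes.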


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.
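
Both properties can be illustrated with an intrusive, circular
doubly-linked list hashed into a small wheel of buckets. This is a sketch
of the general technique under assumed names (sk_callout, sk_wheel,
SK_WHEEL_SIZE), not the kernel's callout implementation.

```c
#include <stddef.h>

#define SK_WHEEL_SIZE 8	/* hypothetical bucket count */

/* The client embeds or declares this; no allocation happens on insert. */
struct sk_callout {
	struct sk_callout *prev, *next;
	void (*fn)(void *);
	void *arg;
};

struct sk_wheel {
	struct sk_callout bucket[SK_WHEEL_SIZE];	/* sentinel heads */
};

void
sk_wheel_init(struct sk_wheel *w)
{
	for (size_t i = 0; i < SK_WHEEL_SIZE; i++)
		w->bucket[i].prev = w->bucket[i].next = &w->bucket[i];
}

/* O(1): link at the tail of the bucket chosen by the expiry tick. */
void
sk_callout_insert(struct sk_wheel *w, struct sk_callout *c, unsigned when)
{
	struct sk_callout *head = &w->bucket[when % SK_WHEEL_SIZE];

	c->next = head;
	c->prev = head->prev;
	head->prev->next = c;
	head->prev = c;
}

/* O(1): unlink using only the entry's own neighbours. */
void
sk_callout_remove(struct sk_callout *c)
{
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->prev = c->next = c;
}

int
sk_bucket_empty(const struct sk_wheel *w, unsigned slot)
{
	return w->bucket[slot % SK_WHEEL_SIZE].next ==
	    &w->bucket[slot % SK_WHEEL_SIZE];
}
```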

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump,
and to relocate it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
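
A minimal sketch of the initialization rule described above, assuming a
simplified two-field limit structure. The names sketch_rlimit and
sketch_init_limit are hypothetical.

```c
#include <stdint.h>

/* Simplified stand-in for a resource limit pair. */
struct sketch_rlimit {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

/*
 * The hard limit is the system-wide maximum (e.g. maxfiles or maxproc);
 * the soft limit is the compile-time default (e.g. NOFILE or MAXUPRC),
 * clamped so it never exceeds the hard limit.
 */
struct sketch_rlimit
sketch_init_limit(uint64_t compile_default, uint64_t system_max)
{
	struct sketch_rlimit r;

	r.max = system_max;
	r.cur = compile_default < system_max ? compile_default : system_max;
	return r;
}
```

Clamping the soft limit matters on small configurations where the
system-wide maximum is lower than the compile-time default.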


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
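
The argument reordering in this mapping is easy to get wrong, so here is
a small sketch expressing it as wrapper functions. The compat_* names are
hypothetical and exist only to show which argument goes where.

```c
#include <string.h>

/* bcopy(src, dst, len): source first; memcpy wants destination first. */
void *
compat_bcopy(const void *src, void *dst, size_t len)
{
	return memcpy(dst, src, len);
}

/* ovbcopy is the overlap-safe variant, so it maps to memmove. */
void *
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	return memmove(dst, src, len);
}

/* bcmp keeps its argument order. */
int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(p, len) gains an explicit fill byte of 0 in the middle. */
void *
compat_bzero(void *p, size_t len)
{
	return memset(p, 0, len);
}
```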


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.508 02-Dec-2019 ad

Take the basic CPU topology information we already collect, and use it
to make circular lists of CPU siblings in the same core, and in the
same package. Nothing fancy, just enough to have a bit of fun in the
scheduler trying out different tactics.


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
has occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.
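
The shadow-entry idea described above can be sketched in userland C. This is only an illustrative mock under stated assumptions: the names `mock_shadow`, `mock_psref_acquire`, `mock_psref_release` and `mock_psref_leak_check` are hypothetical and are not the kernel's PSREF_DEBUG code, and a fixed-size table stands in for whatever per-LWP or per-CPU storage the real tracker uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_NSHADOW 16		/* hypothetical table size */

/* Shadow entry: which object is referenced and who took the reference. */
struct mock_shadow {
	const void *target;	/* object being referenced */
	const void *caller;	/* hint: where the reference was acquired */
	bool inuse;
};

struct mock_shadow mock_shadows[MOCK_NSHADOW];

/* Record a reference; returns false if the shadow table is full. */
bool
mock_psref_acquire(const void *target, const void *caller)
{
	for (size_t i = 0; i < MOCK_NSHADOW; i++) {
		if (!mock_shadows[i].inuse) {
			mock_shadows[i].target = target;
			mock_shadows[i].caller = caller;
			mock_shadows[i].inuse = true;
			return true;
		}
	}
	return false;
}

/* Drop the shadow entry for a reference to target, if one exists. */
void
mock_psref_release(const void *target)
{
	for (size_t i = 0; i < MOCK_NSHADOW; i++) {
		if (mock_shadows[i].inuse && mock_shadows[i].target == target) {
			mock_shadows[i].inuse = false;
			return;
		}
	}
}

/*
 * Call at a point where no references should remain (the mock analogue
 * of a syscall return).  Returns the caller hint of the first leaked
 * reference, or NULL if nothing is outstanding.
 */
const void *
mock_psref_leak_check(void)
{
	for (size_t i = 0; i < MOCK_NSHADOW; i++) {
		if (mock_shadows[i].inuse)
			return mock_shadows[i].caller;
	}
	return NULL;
}
```

Because the hint lives in the shadow table rather than in the on-stack object, it survives even if the stack object is overwritten after the leak, which is the property the description above relies on.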

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
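
The subtraction described above can be illustrated with a small, self-contained C sketch. The helper names `ts_sub` and `compute_boottime` are hypothetical and are not the functions used in the kernel; the sketch only assumes that both the wall-clock time at initialisation and the elapsed uptime are available as normalised `struct timespec` values.

```c
#include <assert.h>
#include <time.h>

/* Return a - b for normalised timespecs with a >= b. */
struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Hypothetical helper: "now" is the time of day at the moment it is
 * initialised, "uptime" is the interval since the clocks were started.
 * The result is the time at which the clocks were started.
 */
struct timespec
compute_boottime(struct timespec now, struct timespec uptime)
{
	assert(now.tv_nsec >= 0 && now.tv_nsec < 1000000000L);
	assert(uptime.tv_nsec >= 0 && uptime.tv_nsec < 1000000000L);
	return ts_sub(now, uptime);
}
```

For instance, if the time of day is set 6.5 seconds after the clocks start, the computed boottime is 6.5 seconds earlier than the initialisation time, matching the offset this change removes.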

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and intialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.
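
To make the register/evaluate flow concrete, here is a userland mock in C. It is a sketch under loud assumptions, not the kernel implementation: every `mock_` name is hypothetical, the registry is a fixed-size array, the error codes are illustrative choices, and the "credential" is reduced to a plain uid so the example stays self-contained.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mock evaluator callback: returns 0 on success or an errno value. */
typedef int (*mock_eval_t)(const char *what, void *arg, void *ret);

struct mock_secmodel {
	const char *id;		/* e.g. "org.netbsd.secmodel.suser" */
	mock_eval_t eval;
};

#define MOCK_MAX_SECMODELS 8	/* hypothetical limit */
struct mock_secmodel mock_models[MOCK_MAX_SECMODELS];
size_t mock_nmodels;

/* Register a secmodel's evaluator under its identifier. */
int
mock_secmodel_register(const char *id, mock_eval_t eval)
{
	if (mock_nmodels == MOCK_MAX_SECMODELS)
		return ENOMEM;
	mock_models[mock_nmodels].id = id;
	mock_models[mock_nmodels].eval = eval;
	mock_nmodels++;
	return 0;
}

/* Forward a request to the named secmodel; error out if it is absent. */
int
mock_secmodel_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < mock_nmodels; i++) {
		if (strcmp(mock_models[i].id, id) == 0)
			return mock_models[i].eval(what, arg, ret);
	}
	return EINVAL;	/* secmodel not registered */
}

/* Mock "suser" evaluator: arg points at a uid, ret at a bool. */
int
mock_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return ENOENT;	/* unknown command */
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

The caller never touches the other secmodel's state directly; it only sees an error when the module is absent, which mirrors the behaviour described for the real interface.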

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.
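
The "always present op vector" idea can be illustrated with a small
user-space sketch. Nothing here is the actual bpf interface: the
toy_* names, the single tap operation and the byte counter are
hypothetical. What it shows is that the client calls through a
pointer that starts out aimed at no-op stubs, so the client needs no
#if around the call, and the real implementation swaps the pointer
when it attaches.

```c
#include <stddef.h>

/* Hypothetical operations vector with a single "tap" entry point. */
struct toy_tap_ops {
	void (*tap)(void *ctx, const void *pkt, size_t len);
};

/* Stub used while the real implementation is not attached: does nothing. */
static void
toy_tap_stub(void *ctx, const void *pkt, size_t len)
{
	(void)ctx;
	(void)pkt;
	(void)len;
}

static const struct toy_tap_ops toy_stub_ops = { .tap = toy_tap_stub };

/* The vector is always valid; it initially points at the stubs. */
static const struct toy_tap_ops *toy_ops = &toy_stub_ops;

/* "Real" implementation for the sketch: just counts the bytes seen. */
static size_t toy_bytes_seen;

static void
toy_tap_real(void *ctx, const void *pkt, size_t len)
{
	(void)ctx;
	(void)pkt;
	toy_bytes_seen += len;
}

static const struct toy_tap_ops toy_real_ops = { .tap = toy_tap_real };

/* Called when the real implementation attaches. */
static void
toy_attach(void)
{
	toy_ops = &toy_real_ops;
}

/* Client (driver) code: unconditional call, no preprocessor guard. */
static void
toy_driver_input(const void *pkt, size_t len)
{
	toy_ops->tap(NULL, pkt, len);
}
```

Before toy_attach() runs, toy_driver_input() is harmless and
toy_bytes_seen stays at zero; afterwards every packet is counted.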

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
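
A minimal user-space sketch of the behaviour described above, using
C11 atomics and a pthread mutex in place of the kernel primitives.
The toy_* names, the two-state flag and the example init routine are
assumptions for illustration and are not taken from the kernel
source. It shows the three properties the log mentions: a lockless
fast path once initialization has run, a single global lock for the
slow path, and no retry after the routine has returned an error.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

enum { TOY_ONCE_NEW, TOY_ONCE_DONE };

struct toy_once {
	atomic_int state;	/* TOY_ONCE_NEW until the routine has run */
	int error;		/* result of the routine, valid once DONE */
};

#define TOY_ONCE_INITIALIZER { TOY_ONCE_NEW, 0 }

/* One global lock shared by all once controls (slow path only). */
static pthread_mutex_t toy_once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
toy_run_once(struct toy_once *o, int (*fn)(void))
{
	/* Fast path: no lock taken once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) ==
	    TOY_ONCE_DONE)
		return o->error;

	pthread_mutex_lock(&toy_once_lock);
	if (atomic_load_explicit(&o->state, memory_order_relaxed) ==
	    TOY_ONCE_NEW) {
		/* Run exactly once; the result is recorded even on error. */
		o->error = fn();
		atomic_store_explicit(&o->state, TOY_ONCE_DONE,
		    memory_order_release);
	}
	pthread_mutex_unlock(&toy_once_lock);
	return o->error;
}

/* Example routine for the sketch: counts its calls and always fails. */
static int toy_init_calls;

static int
toy_failing_init(void)
{
	toy_init_calls++;
	return EIO;
}
```

Because this sketch holds the global lock while the routine runs, a
routine that itself calls toy_run_once() would deadlock; that is a
simplification of the sketch, not a statement about the kernel code.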


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
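
The limit described above works out as simple arithmetic. The sketch
below is an illustration only: the toy_* names are hypothetical, 2kB
is taken to mean 2048 bytes, and the real sysctl handler may round or
order the checks differently.

```c
#include <stdint.h>

#define TOY_VNODE_COST 2048u	/* assumed worst-case bytes per vnode */

/*
 * Largest vnode count whose worst-case footprint stays within 75% of
 * both the kernel virtual address space and physical memory.
 */
static uint64_t
toy_max_vnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	return (limit / 4 * 3) / TOY_VNODE_COST;
}

/* Returns 1 if the requested value would be accepted, 0 otherwise. */
static int
toy_vnodes_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= toy_max_vnodes(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 4 GiB of RAM the smaller figure
governs: 75% of 1 GiB is 805306368 bytes, which at 2048 bytes per
vnode allows at most 393216 vnodes.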


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.
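
The weak-alias arrangement described above can be sketched in C as follows. This is only an illustration, assuming a GCC- or Clang-style toolchain on an ELF target; secmodel_nop() and boot_sketch() are hypothetical names chosen for the example, and the kernel itself may spell the alias differently.

```c
/* No-op used when no security model is compiled in (hypothetical name). */
void
secmodel_nop(void)
{
}

/*
 * Weak alias: if another object file provides a strong definition of
 * secmodel_start(), the linker uses that one; otherwise calls resolve
 * to secmodel_nop().  The attribute syntax is a GCC/Clang extension
 * and requires an ELF target.
 */
void secmodel_start(void) __attribute__((weak, alias("secmodel_nop")));

/* Stand-in for the early boot path that starts the security model. */
void
boot_sketch(void)
{
	secmodel_start();
}
```

With no strong definition linked in, boot_sketch() simply calls the no-op, which is what lets a kernel without a secmodel still link.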


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
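
The reference-counting rule in the list above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) might look roughly like this. It is a userland sketch under stated assumptions: the ldisc_* names and struct layout are hypothetical stand-ins for the ttyldisc_*() routines, and the locking a real kernel would need is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry on a linked list. */
struct ldisc {
	const char   *name;
	unsigned      refcnt;   /* references held by lookups */
	struct ldisc *next;
};

static struct ldisc *ldisc_list;

/* Find a discipline by name and take a reference; NULL if not registered. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *ld = ldisc_list; ld != NULL; ld = ld->next) {
		if (strcmp(ld->name, name) == 0) {
			ld->refcnt++;
			return ld;
		}
	}
	return NULL;
}

/* Drop a reference taken by ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	ld->refcnt--;
}

/* Register a discipline; fails with EEXIST if the name is already taken. */
int
ldisc_attach(struct ldisc *ld)
{
	struct ldisc *dup = ldisc_lookup(ld->name);

	if (dup != NULL) {
		ldisc_release(dup);
		return EEXIST;
	}
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Unregister a discipline; refuses with EBUSY while references are held. */
int
ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	}
	return ENOENT;
}
```

The point of the design is that the provider of a discipline owns its registration, while users only ever hold counted references.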


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
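
The "once control" idea used here can be sketched as below. This is a single-threaded simplification with hypothetical names (struct once, run_once(), lkm_init_sketch(), lkm_open_sketch()); it is not the kernel's actual once facility, which also has to cope with concurrent callers.

```c
#include <stdbool.h>

/* Hypothetical once-control: remembers whether init ran and its result. */
struct once {
	bool done;
	int  status;
};

/* Counts initializer invocations so the effect is observable. */
int lkm_init_calls;

/* Stand-in for a subsystem initializer; the name is illustrative only. */
static int
lkm_init_sketch(void)
{
	lkm_init_calls++;
	return 0;
}

/* Run fn on the first call only; later calls return the saved status. */
static int
run_once(struct once *o, int (*fn)(void))
{
	if (!o->done) {
		o->status = fn();
		o->done = true;
	}
	return o->status;
}

static struct once lkm_once;

/*
 * Entry point that used to depend on main() having initialized the
 * subsystem; it now triggers initialization lazily on first use.
 */
int
lkm_open_sketch(void)
{
	return run_once(&lkm_once, lkm_init_sketch);
}
```

Moving initialization behind the first open is what allows the explicit call to be dropped from main().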


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
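
As a rough illustration of the expansion step, the following C sketch rewrites a symlink target by substituting known magic strings. The table contents, the function name expand_magic(), and the plain prefix matching are assumptions made for this example; the in-kernel code may apply stricter matching rules and reads the values from the running system.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical substitution table; values are illustrative only. */
struct magic_sub {
	const char *name;   /* magic string, including the leading '@' */
	const char *value;  /* replacement text */
};

/* Longer names come first so "@machine_arch" wins over "@machine". */
static const struct magic_sub magic_subs[] = {
	{ "@machine_arch", "i386" },
	{ "@machine",      "i386" },
	{ "@ostype",       "NetBSD" },
};

/*
 * Copy src to dst, replacing each known magic string with its value.
 * Returns 0 on success, or -1 if dst is too small for the result.
 */
int
expand_magic(const char *src, char *dst, size_t dstlen)
{
	size_t used = 0;

	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const char *piece = src;   /* default: copy one literal byte */
		size_t piecelen = 1, advance = 1;

		if (*src == '@') {
			size_t nsubs = sizeof(magic_subs) / sizeof(magic_subs[0]);

			for (size_t i = 0; i < nsubs; i++) {
				size_t n = strlen(magic_subs[i].name);

				if (strncmp(src, magic_subs[i].name, n) == 0) {
					piece = magic_subs[i].value;
					piecelen = strlen(piece);
					advance = n;
					break;
				}
			}
		}
		/* Leave room for the terminating NUL. */
		if (used + piecelen >= dstlen)
			return -1;
		memcpy(dst + used, piece, piecelen);
		used += piecelen;
		src += advance;
	}
	dst[used] = '\0';
	return 0;
}
```

With the table above, the link target /arch/@machine_arch/bin from the example usage would come out as /arch/i386/bin.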


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
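
The "iterate until no finalizer reports work" rule can be sketched as follows. This is a minimal illustration, not the kernel's interface: `struct finalizer`, `run_finalizers()` and `countdown_finalizer()` are hypothetical names, and a nonzero return value is assumed to mean "did some work".

```c
#include <stddef.h>

/* Hypothetical finalizer: returns nonzero if it did any work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass completes in which
 * none of them reports work. Returns the number of passes made.
 */
unsigned
run_finalizers(struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}

/* Example finalizer: has *arg units of work left, does one per call. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

The loop always makes one final pass in which nothing happens; that quiet pass is what lets a finalizer such as RAID autoconfiguration pick up devices that an earlier finalizer only made visible.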


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS                0
KERN_INVALID_ADDRESS        EFAULT
KERN_PROTECTION_FAILURE     EACCES
KERN_NO_SPACE               ENOMEM
KERN_INVALID_ARGUMENT       EINVAL
KERN_FAILURE                various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE      ENOMEM
KERN_NOT_RECEIVER           <unused>
KERN_NO_ACCESS              <unused>
KERN_PAGES_LOCKED           <unused>


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
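
The sizing rule can be written out as a small calculation. This is a sketch under assumptions, not the code in init_main.c: the function name and parameters are hypothetical, the fraction is passed in per-mille so that 5 corresponds to 0.5%, and the per-vnode size is supplied by the caller rather than taken from struct vnode.

```c
#include <stdint.h>

/*
 * Return the default number of vnodes: the given per-mille share of
 * physical memory divided by the size of one vnode, but never fewer
 * than the configured floor (NVNODE in the commit message).
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned permille, uint64_t floor)
{
	uint64_t budget, count;

	if (vnode_size == 0)
		return floor;

	/* Divide first so the multiplication cannot overflow. */
	budget = physmem_bytes / 1000 * permille;
	count = budget / vnode_size;

	return count > floor ? count : floor;
}
```

On machines with little memory the computed share falls below the floor and the compiled-in value wins, which matches the "but minimum NVNODE vnodes" wording above.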


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports, like alpha and mips, call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
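
The two rules above amount to clamping the soft limit to the system-wide maximum. A minimal sketch, assuming a hypothetical helper `init_count_limit()` (not a function from the kernel), where `dflt` stands for NOFILE or MAXUPRC and `sysmax` for maxfiles or maxproc:

```c
#include <sys/resource.h>

/*
 * Initialise a count-style resource limit: the hard limit is the
 * system-wide maximum rather than RLIM_INFINITY, and the soft limit is
 * the compiled-in default clamped to that maximum.
 */
void
init_count_limit(struct rlimit *rl, rlim_t dflt, rlim_t sysmax)
{
	rl->rlim_max = sysmax;
	rl->rlim_cur = dflt < sysmax ? dflt : sysmax;
}
```

A program that sizes a table from getrlimit() then sees a finite, usable value for both limits, which is the case the commit message mentions for the jdk.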


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
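
The point to watch in this mapping is that the source and destination arguments swap places for the copy routines. The wrappers below are illustrative only; the old_* names are hypothetical and chosen so they do not clash with the real functions.

```c
#include <string.h>

/* bcopy(src, dst, len): regions must not overlap. */
void
old_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len): regions may overlap. */
void
old_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len): zero if equal; argument order is unchanged. */
int
old_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len): memset gains an explicit fill value of 0. */
void
old_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```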


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.507 01-Dec-2019 ad

Init kern_runq and kern_synch before booting secondary CPUs.


Revision tags: phil-wifi-20191119
# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-0-RC1 netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled
as part of module initialization, and do not require manual invocation.
sysmon_taskq is special, since it is potentially used by several non-module
users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from.
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from both MD code and MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
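
The two rules above can be sketched as a single check. This is only an
illustration of the policy as stated in this message, not the
secmodel_extensions(9) code: the enum, the helper name, and the choice of
EPERM as the error are all assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical names for the three knobs mentioned in the message. */
enum knob_sketch {
	KNOB_CURTAIN,
	KNOB_USERMOUNT,
	KNOB_USER_SET_CPU_AFFINITY
};

/*
 * Hypothetical helper: return 0 if the knob may change from oldval to
 * newval, or EPERM (an assumed error code) if the change is refused.
 */
static int
knob_change_allowed_sketch(enum knob_sketch k, bool oldval, bool newval,
    bool securelevel_above_0)
{
	assert(k <= KNOB_USER_SET_CPU_AFFINITY);

	/* No restriction at low securelevel, and no-op writes are harmless. */
	if (!securelevel_above_0 || oldval == newval)
		return 0;

	/* curtain: may be enabled any time, but not disabled. */
	if (k == KNOB_CURTAIN)
		return newval ? 0 : EPERM;

	/* usermount / user_set_cpu_affinity: may be disabled, not enabled. */
	return newval ? EPERM : 0;
}
```

In both cases the change that tightens the system is always allowed and
only the loosening direction is refused once securelevel is raised.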

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
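
To give a feel for what such a statistical test looks like, here is a
minimal sketch of a monobit check, which as far as I recall is one of the
four FIPS 140-2 tests (alongside poker, runs, and long runs). The
20,000-bit block size and the 9725 < ones < 10275 acceptance interval are
quoted from memory of FIPS 140-2 and should be checked against the
standard; the function name is hypothetical, and this is not the code in
libkern/rngtest.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MONOBIT_BYTES	2500	/* 20,000 bits per sample block (assumed) */

/* Hypothetical helper: true if the block passes the monobit test. */
static bool
monobit_ok_sketch(const uint8_t buf[MONOBIT_BYTES])
{
	unsigned ones = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < MONOBIT_BYTES; i++) {
		/* Clear the lowest set bit until none remain. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	/* Assumed FIPS 140-2 bounds on the number of one bits. */
	return ones > 9725 && ones < 10275;
}
```

A stuck-at-zero source fails immediately (0 one bits). A perfectly
alternating pattern such as 0xAA passes this particular test, which is
one reason the standard combines several tests.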

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
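
The flag semantics in items 2 and 4 can be sketched as below. The flag
names come from this message, but the numeric values, the struct layout,
the "_sketch" helper names, and the EPERM return are assumptions made for
illustration; this is not the kern_module.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODFLG_MUST_FORCE_SKETCH	0x01	/* value is an assumption */
#define MODCTL_LOAD_FORCE_SKETCH	0x02	/* value is an assumption */

/* Hypothetical, heavily reduced stand-in for 'struct module'. */
struct module_sketch {
	int	mod_flags;
	bool	builtin;	/* entry is on the list of builtins */
};

/* Set the flag; per the message, it is never cleared once set. */
static void
module_require_force_sketch(struct module_sketch *m)
{
	assert(m != NULL);
	m->mod_flags |= MODFLG_MUST_FORCE_SKETCH;
}

/* Return 0 if a (re-)load may proceed, or EPERM if force is required. */
static int
module_load_check_sketch(const struct module_sketch *m, int load_flags)
{
	assert(m != NULL);
	if (m->builtin &&
	    (m->mod_flags & MODFLG_MUST_FORCE_SKETCH) != 0 &&
	    (load_flags & MODCTL_LOAD_FORCE_SKETCH) == 0)
		return EPERM;
	return 0;
}
```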

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. System with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed, with no objections, on tech-kern. Changes summary by apb@.
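
The handling of answers at the init path prompt described above could be
sketched like this. The enum, the function name, and the rule that a
usable path must begin with '/' are assumptions for illustration; the
real prompt code in init_main.c may decide differently.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of what the user typed at the prompt. */
enum init_answer_sketch {
	INIT_ANS_HALT,
	INIT_ANS_REBOOT,
	INIT_ANS_DDB,
	INIT_ANS_PATH,
	INIT_ANS_USAGE		/* neither a command nor a path */
};

static enum init_answer_sketch
classify_init_answer_sketch(const char *s)
{
	assert(s != NULL);

	/* The three special strings named in the message. */
	if (strcmp(s, "halt") == 0)
		return INIT_ANS_HALT;
	if (strcmp(s, "reboot") == 0)
		return INIT_ANS_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return INIT_ANS_DDB;

	/* Assumption: anything else must look like an absolute path. */
	if (s[0] == '/')
		return INIT_ANS_PATH;

	return INIT_ANS_USAGE;
}
```

The last case is where a short usage line would be printed before
prompting again.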


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
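
A userland sketch of the idea, using C11 atomics and a pthread mutex in
place of the kernel primitives. The struct, the function names, and the
demo init routine are hypothetical, and holding the global lock while the
init routine runs is a simplification of this sketch, not a claim about
subr_once.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once control block: zero-initialise before first use. */
struct once_sketch {
	atomic_int	done;	/* 0 = init not yet run, 1 = init has run */
	int		error;	/* result of the single init call */
};

/* One global lock shared by all once blocks, as the message describes. */
static pthread_mutex_t once_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: lockless check once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock_sketch);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; an error is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock_sketch);
	return o->error;
}

/* Hypothetical init routine that always fails, for demonstration. */
static int demo_calls_sketch;

static int
demo_init_fails_sketch(void)
{
	demo_calls_sketch++;
	return 5;
}
```

Calling run_once_sketch() twice with demo_init_fails_sketch returns the
same error both times while the routine itself runs only once.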


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
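
As a worked example of the arithmetic, here is a small sketch of the cap.
The 2 kB cost and the 75% fraction come from this message; the function
name, its parameters, and taking the smaller of the two resources are my
reading of it, not the actual sysctl handler.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: largest vnode count whose worst-case footprint
 * stays within 75% of both KVA and physical memory.
 */
static uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t physmem_bytes,
    uint64_t vnode_cost_bytes)
{
	uint64_t limit;

	assert(vnode_cost_bytes != 0);

	/* The tighter of the two resources decides. */
	limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* 75% of it, dividing first so the multiply cannot overflow. */
	return (limit / 4) * 3 / vnode_cost_bytes;
}
```

With 1 GB of KVA, 4 GB of RAM, and 2048 bytes per vnode this gives
(2^30 / 4) * 3 / 2^11 = 393216 vnodes.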


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
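
The list-based, reference-counted registry described in this entry can be sketched roughly as below. This is a simplified userland illustration under stated assumptions, not the kernel implementation: the names `struct ldisc`, `ld_attach`, `ld_lookup`, `ld_release` and `ld_detach` are hypothetical stand-ins (the real ttyldisc_*() signatures are not given in the log), the `EEXIST` and `ENOENT` returns are choices made for the sketch, and all locking is omitted. Only the linked list, the reference taken on lookup, and the EBUSY result for an in-use discipline come from the log message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical registry entry; not the kernel's struct linesw. */
struct ldisc {
	const char   *name;
	unsigned int  refcnt;   /* references handed out by ld_lookup() */
	struct ldisc *next;     /* linked list instead of a fixed array */
};

static struct ldisc *ld_head;

/* Register a discipline; fails if the name is already registered. */
int
ld_attach(struct ldisc *ld)
{
	for (struct ldisc *p = ld_head; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ld_head;
	ld_head = ld;
	return 0;
}

/* Look up a discipline by name and take a reference on success. */
struct ldisc *
ld_lookup(const char *name)
{
	for (struct ldisc *p = ld_head; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ld_lookup(). */
void
ld_release(struct ldisc *ld)
{
	if (ld->refcnt > 0)
		ld->refcnt--;
}

/* Unregister; refuse with EBUSY while references are outstanding. */
int
ld_detach(struct ldisc *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc **pp = &ld_head; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

In this sketch a provider calls `ld_attach()` once at load time, users bracket their use with `ld_lookup()` and `ld_release()`, and `ld_detach()` succeeds only after the last reference has been dropped.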


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
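
The once-control pattern used in this and the neighbouring entries can be sketched as below. This is a minimal single-threaded illustration, not the kernel's once API: `struct once_ctl`, `run_once`, and the `subsys_*` names are hypothetical, and a real once control must also serialize concurrent first callers, which this sketch does not attempt.

```c
#include <stdbool.h>

/* Hypothetical once control: records whether the initializer has run. */
struct once_ctl {
	bool done;
	int  error;   /* initializer result, returned to later callers */
};

/*
 * Run init() the first time this is called for a given control and
 * remember its result. Later calls return the saved result without
 * running init() again. Not thread-safe.
 */
int
run_once(struct once_ctl *ctl, int (*init)(void))
{
	if (!ctl->done) {
		ctl->error = init();
		ctl->done = true;
	}
	return ctl->error;
}

/* Example subsystem: counts how many times its initializer ran. */
static int subsys_init_calls;
static struct once_ctl subsys_once;

static int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

/* Stand-in for an open routine that initializes lazily on first use. */
int
subsys_open(void)
{
	return run_once(&subsys_once, subsys_init);
}

/* Accessor so the effect can be observed from outside. */
int
subsys_init_count(void)
{
	return subsys_init_calls;
}
```

With initialization tied to first use in this way, main() no longer needs an explicit call to the subsystem's init routine.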


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
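
The substitution described above can be illustrated with a small user-space
sketch. This is not the kernel's code: the function name expand_magic, the
struct magic_var table, and the rule that entries are tried in table order
are all assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name and the value it expands to. */
struct magic_var {
	const char *name;
	const char *value;
};

/*
 * Expand "@name" tokens in src into dst (at most dstlen bytes, including
 * the terminating NUL).  Returns 0 on success, -1 if dst is too small.
 * Entries are tried in table order, so a longer name that shares a prefix
 * with a shorter one ("machine_arch" vs. "machine") must come first.
 */
static int
expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 0, i;

		if (*src == '@') {
			for (i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (strncmp(src + 1, vars[i].name, n) == 0) {
					rep = vars[i].value;
					skip = n + 1;	/* '@' plus the name */
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t rlen = strlen(rep);

			if (out + rlen >= dstlen)
				return -1;
			memcpy(dst + out, rep, rlen);
			out += rlen;
			src += skip;
		} else {
			/* Ordinary character, or '@' with no known name. */
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	if (out >= dstlen)
		return -1;
	dst[out] = '\0';
	return 0;
}
```

With a table mapping machine_arch to "i386", the link target from the
example usage would expand to /arch/i386/bin.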


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
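
The marker-skipping iteration can be sketched as follows. This is an
illustration only, not the kernel macro: struct item, its is_marker field,
and the ITEM_FOREACH name are hypothetical stand-ins for struct proc, its
marker flag, and PROCLIST_FOREACH.

```c
#include <stddef.h>

/* Hypothetical list element; is_marker flags placeholder entries. */
struct item {
	struct item *next;
	int is_marker;
	int value;
};

/*
 * Walk the list like a plain FOREACH, but skip marker entries so the
 * loop body only ever sees real elements.
 */
#define ITEM_FOREACH(var, head)						\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Sum the values of all non-marker entries. */
static int
sum_values(struct item *head)
{
	struct item *it;
	int sum = 0;

	ITEM_FOREACH(it, head)
		sum += it->value;
	return sum;
}
```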


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
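
The "run until nobody did any work" behaviour of the finalization hook can
be sketched like this. The names struct finalizer, run_finalizers and
countdown_finalizer are hypothetical and are not the kernel's interface;
the sketch assumes a finalizer returns nonzero when it did work.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer repeatedly until a complete pass goes by in
 * which none of them reports work.  Returns the number of passes made.
 */
static unsigned
run_finalizers(struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int did_work;
	size_t i;

	do {
		did_work = 0;
		for (i = 0; i < n; i++) {
			if (list[i].fn(list[i].arg))
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: does one unit of work per call until *arg is 0. */
static int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Looping over the whole list again matters because work done by one
finalizer (for example, configuring a RAID set) may create new work for
another that already ran in the same pass.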


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
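
The mapping above can be written out as a small translation function. This
is a sketch for illustration, not code from the commit: the OLD_-prefixed
enumerators and their numeric values are hypothetical stand-ins for the
retired KERN_* codes, and KERN_FAILURE is left out because the log says it
mostly became assertions rather than a single errno value.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return EINVAL;	/* defensive default for out-of-range input */
}
```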


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
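
The sizing rule can be sketched as a simple calculation. This is an
assumption-laden illustration, not the code from the commit: the function
name, the per-mille parameter (5 for 0.5%), and passing the vnode size and
minimum as arguments are all hypothetical.

```c
#include <stdint.h>

/*
 * Number of vnodes that fit in `permille` thousandths of physical memory,
 * but never fewer than min_vnodes (the compiled-in NVNODE in the log).
 * Dividing before multiplying avoids overflow for large memory sizes at
 * the cost of a little rounding.
 */
static uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t permille, uint64_t min_vnodes)
{
	uint64_t budget = physmem_bytes / 1000 * permille;
	uint64_t count = budget / vnode_size;

	return count < min_vnodes ? min_vnodes : count;
}
```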


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resources limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
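
The rule described above (hard limit set to the system-wide maximum, soft
limit clamped to it) might look roughly like this. It is a sketch, not the
committed code: sketch_init_limit() and its parameter names are hypothetical
stand-ins for the NOFILE/maxfiles and MAXUPRC/maxproc pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>	/* struct rlimit, rlim_t */

/*
 * Hypothetical helper: the hard limit is the system-wide maximum rather
 * than RLIM_INFINITY, and the soft limit is min(default, maximum).
 */
static void
sketch_init_limit(struct rlimit *rl, rlim_t soft_default, rlim_t sys_max)
{
	assert(rl != NULL);
	rl->rlim_max = sys_max;
	rl->rlim_cur = soft_default < sys_max ? soft_default : sys_max;
}
```

With this shape the soft limit can never exceed the hard limit, even on a
kernel configured with a small maximum.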


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
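
The detail that is easy to get wrong in this conversion is the argument
order: the b* routines take the source first, while the mem* routines take
the destination first. A small sketch of the mapping, using hypothetical
wrapper names that are not part of the commit:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrappers that spell out the mapping listed above. */
static void
sketch_bcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memcpy(dst, src, len);		/* source and destination swap places */
}

static void
sketch_ovbcopy(const void *src, void *dst, size_t len)
{
	assert(len == 0 || (src != NULL && dst != NULL));
	memmove(dst, src, len);		/* overlap-safe variant */
}

static int
sketch_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);	/* same argument order */
}

static void
sketch_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);		/* fill value goes in the middle */
}
```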


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.506 03-Oct-2019 kamil

Remove compile-time asserts checking whether intptr_t and void* are compat

The checks were requested by core@ as a prerequisite for kevent::udata type
switch from intptr_t to void*.


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector, one that can tell where a leak occurs,
whereas the simpler version that is already committed only reports that a leak
occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry for each intermediate object and stores some
hints in it (currently the caller address of psref_acquire). We can detect a
leak by checking the entries at certain points where all references should have
been released, such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
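
The arithmetic described in this change can be sketched with plain timespec
math: boottime is the time of day at initialisation minus the uptime at that
moment, and a message's wall-clock time is boottime plus the uptime logged
with the message. The helper names below are hypothetical and this is not
the kernel or dmesg code.

```c
#include <assert.h>
#include <time.h>

#define SKETCH_NSEC_PER_SEC 1000000000L

/* a - b, assuming both have tv_nsec in [0, 1e9) and a >= b. */
static struct timespec
sketch_ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {		/* borrow one second */
		r.tv_sec--;
		r.tv_nsec += SKETCH_NSEC_PER_SEC;
	}
	return r;
}

/* a + b, assuming both have tv_nsec in [0, 1e9). */
static struct timespec
sketch_ts_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= SKETCH_NSEC_PER_SEC) {	/* carry one second */
		r.tv_sec++;
		r.tv_nsec -= SKETCH_NSEC_PER_SEC;
	}
	return r;
}

/* boottime = time of day at init - uptime at init */
static struct timespec
sketch_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	return sketch_ts_sub(tod_at_init, uptime_at_init);
}

/* wall-clock time of a message = boottime + uptime logged with it */
static struct timespec
sketch_msg_time(struct timespec boottime, struct timespec msg_uptime)
{
	assert(msg_uptime.tv_nsec >= 0 &&
	    msg_uptime.tv_nsec < SKETCH_NSEC_PER_SEC);
	return sketch_ts_add(boottime, msg_uptime);
}
```

Keeping nanosecond resolution until the final conversion avoids the
timeval/timespec mix-up the change mentions.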


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
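
The class of bug fixed here is sizing a copy from the wrong object. A
minimal user-space sketch, assuming a three-slot argv (the commit only
gives the count; what each slot holds is not stated, and memcpy() merely
stands in for the kernel's copyout()):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_INIT_ARGV_SLOTS 3	/* count taken from the commit message */

/*
 * Hypothetical helper: copy an argv vector to dst and return the number
 * of bytes copied. The length comes from the element type of argv
 * itself, not from sizeof() of some unrelated argument structure.
 */
static size_t
sketch_copy_init_argv(void *dst, char *const argv[])
{
	size_t len = SKETCH_INIT_ARGV_SLOTS * sizeof(char *);

	assert(dst != NULL && argv != NULL);
	memcpy(dst, argv, len);
	return len;
}
```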


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
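
For illustration only: a version 4 UUID in the RFC 4122 layout is 16 random bytes with the version and variant fields overwritten. The sketch below is user-space C, not the kernel's code; uuid4_from_random() is a hypothetical helper, and the caller is assumed to supply the random bytes.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a NetBSD function): build a version 4
 * UUID from 16 caller-supplied random bytes, RFC 4122 layout.
 */
void
uuid4_from_random(uint8_t out[16], const uint8_t rnd[16])
{
	memcpy(out, rnd, 16);
	out[6] = (out[6] & 0x0f) | 0x40;	/* high nibble: version 4 */
	out[8] = (out[8] & 0x3f) | 0x80;	/* top two bits: variant 10 */
}
```

Every other bit comes straight from the random input, which is why no MAC address, timestamp, or shared generator state is needed.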


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
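
A minimal sketch of how such a fallback list can be walked, assuming the paths are simply tried in order. first_usable_init() and only_rescue() are hypothetical names; the predicate stands in for the kernel actually attempting to execute each path.

```c
#include <stddef.h>
#include <string.h>

/* The list as given in the log message above. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/*
 * Hypothetical helper: return the first path accepted by usable(),
 * or NULL if the list is exhausted.
 */
const char *
first_usable_init(const char *const *paths, int (*usable)(const char *))
{
	for (; *paths != NULL; paths++)
		if (usable(*paths))
			return *paths;
	return NULL;
}

/* Example predicate: pretend only the rescue copy can be run. */
int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}
```

With only_rescue() as the predicate the walk ends on "/rescue/init"; a NULL result corresponds to the case where no init can be found at all.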


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate the vnode cache size based on the resource it is allocated from.
This stops kern.maxvnodes from being set so high that it exhausts available space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objections. Change summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
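
As a rough illustration of the arithmetic described above (not the code
from this commit), the cap can be sketched as follows. The helper names
vnode_limit_for(), vnode_limit() and desiredvnodes_ok() are hypothetical;
only the 2kB per-vnode cost and the 75% thresholds come from the log
message.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case cost of one vnode and its associated structures (2kB,
 * as assumed in the log message). */
#define VNODE_COST_BYTES 2048u

/* Hypothetical helper: the largest vnode count whose worst-case
 * footprint stays within 75% of `bytes`. */
static uint64_t
vnode_limit_for(uint64_t bytes, uint64_t cost)
{
	assert(cost > 0);		/* guard the division below */
	return (bytes / 4 * 3) / cost;
}

/* Hypothetical helper: upper bound on desiredvnodes, taking the
 * stricter of the KVA limit and the physical memory limit. */
static uint64_t
vnode_limit(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t by_kva = vnode_limit_for(kva_bytes, VNODE_COST_BYTES);
	uint64_t by_phys = vnode_limit_for(phys_bytes, VNODE_COST_BYTES);

	return by_kva < by_phys ? by_kva : by_phys;
}

/* Hypothetical check a sysctl handler might apply before accepting a
 * new value: returns 1 if the request is within the cap, else 0. */
static int
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= vnode_limit(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA term is the
binding one in this sketch, so the cap is driven by 75% of 1 GiB.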


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove the include of sys/pset.h from the sys/lwp.h header.
Include it in the few sources that need it.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week
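
A minimal sketch of the beacon/listen interval handling described above: the two intervals are kept in separate fields, and a received beacon interval outside the accepted range causes the frame to be dropped (not clamped) and counted. All names and both limits are hypothetical and chosen for illustration; they are not taken from net80211.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative limits in time units (TU); the real constants may differ. */
#define SKETCH_BINTVAL_MIN	10
#define SKETCH_BINTVAL_MAX	1000

/* Hypothetical per-interface state, not the net80211 structure. */
struct sketch_bss_config {
	uint16_t beacon_intval;		/* configured beacon interval */
	uint16_t listen_intval;		/* listen interval, stored separately */
	unsigned long rx_badbintval;	/* frames dropped for a bogus interval */
};

/*
 * Return true if a beacon interval received in a beacon or probe
 * response is acceptable.  Out-of-range values are counted and
 * rejected; the caller is expected to drop the frame.
 */
bool
sketch_check_bintval(struct sketch_bss_config *cfg, uint16_t rx_bintval)
{
	if (rx_bintval < SKETCH_BINTVAL_MIN ||
	    rx_bintval > SKETCH_BINTVAL_MAX) {
		cfg->rx_badbintval++;
		return false;
	}
	return true;
}

/* Changing the listen interval leaves the beacon interval untouched. */
void
sketch_set_listen_intval(struct sketch_bss_config *cfg, uint16_t intval)
{
	cfg->listen_intval = intval;
}
```

With separate fields, setting the listen interval to 0 can no longer clobber the beacon interval used in hostap or ibss mode.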

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
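
The substitution can be pictured with a small userland sketch. This is not the kernel implementation: the table type, the function name, and the rule that a magic name must be followed by '/' or the end of the string are assumptions made here so that @machine does not match inside @machine_arch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: magic name and its replacement value. */
struct magic_subst {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "sparc" */
};

/*
 * Copy `target` into `out`, replacing each magic name found in `tab`.
 * Returns 0 on success, -1 if the expansion does not fit in `outlen`.
 */
int
expand_magic(const char *target, const struct magic_subst *tab, size_t ntab,
    char *out, size_t outlen)
{
	size_t o = 0;

	if (outlen == 0)
		return -1;
	while (*target != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*target == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				/* Assumed rule: name ends at '/' or NUL. */
				if (strncmp(target, tab[i].name, n) == 0 &&
				    (target[n] == '/' || target[n] == '\0')) {
					rep = tab[i].value;
					skip = n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t rl = strlen(rep);

			if (rl >= outlen - o)
				return -1;
			memcpy(out + o, rep, rl);
			o += rl;
			target += skip;
		} else {
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *target++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

With a table that maps @machine_arch to sparc, the link target /arch/@machine_arch/bin expands to /arch/sparc/bin.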


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
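
A rough sketch of the marker-skipping iteration, using a hand-rolled singly linked list instead of <sys/queue.h>. The struct, the flag, and the macro name are hypothetical stand-ins, not the kernel definitions.

```c
#include <stddef.h>

#define XP_MARKER	0x1	/* hypothetical flag: placeholder entry */

/* Hypothetical, much simplified process list entry. */
struct xproc {
	struct xproc *next;
	int flags;
	int pid;
};

/*
 * Walk the list like a plain FOREACH, but skip marker entries that an
 * iterator may have inserted to remember its position.
 */
#define XPROCLIST_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (((var)->flags & XP_MARKER) != 0)			\
			continue;					\
		else

/* Count the real (non-marker) entries on the list. */
int
count_real_procs(struct xproc *head)
{
	struct xproc *p;
	int n = 0;

	XPROCLIST_FOREACH(p, head)
		n++;
	return n;
}
```

The trailing else makes the macro take exactly one statement as its body, so callers use it the same way as LIST_FOREACH.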


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
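
The sizing rule amounts to simple arithmetic, sketched below. The function name and parameters are hypothetical, and the use of a configured minimum as a floor is an assumption carried over from the earlier 0.5% change, not something this commit states.

```c
#include <stdint.h>

/*
 * Default vnode cache size: the number of vnodes that fit in 1% of
 * physical memory, but never fewer than `min_vnodes`.
 * All parameters are supplied by the caller; none are kernel symbols.
 */
uint64_t
default_numvnodes(uint64_t physmem_bytes, uint64_t vnode_size,
    uint64_t min_vnodes)
{
	uint64_t n;

	if (vnode_size == 0)
		return min_vnodes;
	n = (physmem_bytes / 100) / vnode_size;
	return n < min_vnodes ? min_vnodes : n;
}
```

Because the result is divided by the size of a vnode, a port with a larger struct vnode gets fewer cached vnodes from the same 1%, which is the concern raised for the 64-bit ports.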


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
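
The "run until nobody did any work" loop can be sketched as follows. The types and function names are hypothetical and do not reflect the autoconfiguration API; the countdown finalizer exists only to give the loop something to call.

```c
#include <stddef.h>

/* Hypothetical finalizer: returns non-zero if it did some work. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Invoke every finalizer repeatedly until one full pass completes in
 * which all of them report that they did no work.
 * Returns the number of passes made (at least 1).
 */
unsigned
run_finalizers(struct finalizer *tab, size_t n)
{
	unsigned passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (tab[i].fn(tab[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Illustrative finalizer: does one unit of work per call until done. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Iterating to a fixed point lets one finalizer's work (for example, configuring a RAID set) expose new devices for another finalizer to handle on the next pass.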


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
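
The table above can be read as a translation function. The sketch below is illustrative only: the enum, its OLD_ prefix, and its numeric values are made up here, and KERN_FAILURE plus the unused codes are left out because the commit gives them no single errno value.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/*
 * Map a retired KERN_* code to the errno value listed in the table.
 * Returns -1 for anything the table does not cover.
 */
int
old_kern_to_errno(enum old_kern_code kc)
{
	switch (kc) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```

Note that two distinct old codes collapse into ENOMEM, so the mapping cannot be reversed.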


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
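
The two properties named above, client-supplied storage and constant-time insertion and removal, can be illustrated with a small hashed timer wheel. This is a sketch under stated assumptions: the structure layout, the _sketch function names, and the wheel size are all hypothetical, locking is omitted, and a callback must not stop other callouts in the same bucket while it runs.

```c
#include <stddef.h>

#define WHEEL_SIZE	8	/* illustrative; must be a power of two */

/* Storage is owned by the client and must start out zeroed. */
struct callout {
	struct callout *next;	/* next entry in the same bucket */
	struct callout **prevp;	/* pointer that points at us, or NULL */
	unsigned long expire;	/* absolute tick at which to fire */
	void (*fn)(void *);
	void *arg;
};

struct wheel {
	struct callout *bucket[WHEEL_SIZE];
	unsigned long now;	/* current tick */
};

/* Remove a pending callout in O(1); a no-op if it is not pending. */
void
callout_stop_sketch(struct callout *c)
{
	if (c->prevp == NULL)
		return;
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* (Re)arm a callout to fire `ticks` ticks from now, in O(1). */
void
callout_reset_sketch(struct wheel *w, struct callout *c, unsigned long ticks,
    void (*fn)(void *), void *arg)
{
	struct callout **head;

	callout_stop_sketch(c);
	if (ticks == 0)
		ticks = 1;	/* earliest possible firing is the next tick */
	c->expire = w->now + ticks;
	c->fn = fn;
	c->arg = arg;
	head = &w->bucket[c->expire & (WHEEL_SIZE - 1)];
	c->next = *head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = head;
	*head = c;
}

/* Advance time by one tick and fire the callouts that are due. */
void
wheel_tick(struct wheel *w)
{
	struct callout *c, *next;

	w->now++;
	for (c = w->bucket[w->now & (WHEEL_SIZE - 1)]; c != NULL; c = next) {
		next = c->next;
		if (c->expire == w->now) {
			callout_stop_sketch(c);
			c->fn(c->arg);
		}
	}
}

/* Illustrative callback: increments the int that `arg` points to. */
void
bump_counter(void *arg)
{
	(*(int *)arg)++;
}
```

Because the caller embeds struct callout in its own data, arming a timer never allocates and therefore cannot fail for lack of resources.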


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme,
a core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
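
The rule described above can be sketched as follows. This is a user-space illustration, not the code in init_main.c: the struct and function names are hypothetical, and the SKETCH_NOFILE and SKETCH_MAXUPRC values are placeholder assumptions rather than the kernel's definitions.

```c
#include <assert.h>

/* Placeholder defaults for illustration; the kernel's values may differ. */
#define SKETCH_NOFILE	64
#define SKETCH_MAXUPRC	80

struct sketch_rlimit {
	long rlim_cur;	/* soft limit */
	long rlim_max;	/* hard limit */
};

static long sketch_min(long a, long b)
{
	return a < b ? a : b;
}

/*
 * The hard limit is the system-wide maximum rather than "infinity";
 * the soft limit is the per-process default clamped to that maximum.
 */
static struct sketch_rlimit sketch_init_limit(long dflt, long sysmax)
{
	struct sketch_rlimit rl;

	rl.rlim_max = sysmax;
	rl.rlim_cur = sketch_min(dflt, sysmax);
	assert(rl.rlim_cur <= rl.rlim_max);
	return rl;
}
```

With this shape, a program calling getrlimit() would see a finite hard limit it can size its tables against, and the soft limit can never exceed the hard one.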


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
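
The argument order is the part of this conversion that is easy to get wrong: bcopy() and ovbcopy() take the source first, while memcpy() and memmove() take the destination first. A minimal user-space sketch of the four rewrites; the wrapper names are hypothetical and exist only to show the mapping.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrappers that keep the old (src, dst, len) argument order. */

/* bcopy(src, dst, len) -> memcpy(dst, src, len); regions must not overlap. */
static void copy_old_style(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len); overlap is allowed. */
static void copy_overlap_old_style(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len); reduced here to zero vs non-zero. */
static int differ_old_style(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}

/* bzero(p, len) -> memset(p, 0, len). */
static void zero_old_style(void *p, size_t len)
{
	memset(p, 0, len);
}
```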


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.505 24-Sep-2019 kamil

Add a temporary ctassert checking whether void* and intptr_t are compatible


Revision tags: netbsd-9-base phil-wifi-20190609
# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that enables to tell where a leak occurs
while a simpler version that is already committed just tells an occurrence of a
leak.

Investigating of psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).
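
The adjustment amounts to a timespec subtraction. A minimal user-space sketch, assuming the interval since the clocks started is available as a struct timespec; the function names are hypothetical and this is not the kernel's code.

```c
#include <assert.h>
#include <time.h>

/* Return a - b, normalized so that 0 <= tv_nsec < 1e9. Assumes a >= b. */
static struct timespec sketch_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second to bring the nanoseconds back in range. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Boot time is the time of day at the moment it is initialized, minus
 * how long the clocks had already been running at that moment.
 */
static struct timespec sketch_boottime(struct timespec now,
    struct timespec uptime)
{
	return sketch_timespec_sub(now, uptime);
}
```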

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix net.inet6.ip6.ifq node doesn't exist

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

branches: 1.497.2;
Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.
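
The "establish the hook on first syscall entry" pattern can be sketched as below. This is a single-threaded illustration with hypothetical names; the kernel's RUN_ONCE() also handles concurrent callers, which this guard does not.

```c
#include <assert.h>
#include <stdbool.h>

static bool sketch_hook_ready;	/* guard: has the hook been set up? */
static int sketch_setup_calls;	/* how many times setup actually ran */

/* Hypothetical stand-in for exithook_establish(). */
static int
sketch_establish_hook(void)
{
	sketch_setup_calls++;
	return 0;
}

/* Run the setup on first use only; later calls are no-ops. */
static int
sketch_ensure_hook(void)
{
	if (!sketch_hook_ready) {
		int error = sketch_establish_hook();
		if (error != 0)
			return error;
		sketch_hook_ready = true;
		assert(sketch_setup_calls == 1);
	}
	return 0;
}

/* Every syscall entry point starts with the same guard. */
int
sketch_sys_semget(void)
{
	int error = sketch_ensure_hook();
	if (error != 0)
		return error;
	/* ... the real work of the syscall would go here ... */
	return 0;
}

int
sketch_sys_semop(void)
{
	int error = sketch_ensure_hook();
	if (error != 0)
		return error;
	return 0;
}

/* Accessor so callers can observe how often setup ran. */
int
sketch_setup_count(void)
{
	return sketch_setup_calls;
}
```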


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
  arguments
- the arguments cannot be strings if you want vmstat -u to
  work (this could be fixed, and i might, as it would be nice
  if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.
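
The restriction described above (one literal format string plus at most four numeric arguments per entry) follows naturally from a fixed-size ring of records. A minimal sketch, with hypothetical names and sizes that are not the kernhist implementation:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_HIST_ENTRIES	4	/* ring size, chosen for the example */
#define SKETCH_HIST_NARGS	4	/* at most four arguments per entry */

struct sketch_hist_entry {
	const char *fmt;			/* literal format, stored by pointer */
	unsigned long args[SKETCH_HIST_NARGS];	/* numeric arguments only */
};

struct sketch_hist {
	struct sketch_hist_entry e[SKETCH_HIST_ENTRIES];
	size_t next;	/* slot the next entry will overwrite */
	size_t count;	/* number of valid entries, capped at ring size */
};

/* Record one entry; the oldest entry is overwritten once the ring is full. */
void
sketch_hist_log(struct sketch_hist *h, const char *fmt,
    unsigned long a0, unsigned long a1, unsigned long a2, unsigned long a3)
{
	assert(h != NULL && fmt != NULL);
	struct sketch_hist_entry *e = &h->e[h->next];

	e->fmt = fmt;
	e->args[0] = a0;
	e->args[1] = a1;
	e->args[2] = a2;
	e->args[3] = a3;
	h->next = (h->next + 1) % SKETCH_HIST_ENTRIES;
	if (h->count < SKETCH_HIST_ENTRIES)
		h->count++;
}
```

Because only the format pointer is stored, a string argument would have to outlive the entry, which is one reason such logs stick to numbers.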


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
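
For reference, a version 4 UUID is 16 random bytes with the version and variant bits forced. The sketch below assumes the byte-wise RFC 4122 layout and uses hypothetical function names; it is not the kernel's uuidgen(2) code, and the caller is assumed to supply good random bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_UUID_BYTES	16
#define SKETCH_UUID_STRLEN	37	/* 36 characters plus NUL */

/* Turn 16 random bytes into a v4 UUID by forcing version and variant bits. */
void
sketch_uuid_v4(uint8_t out[SKETCH_UUID_BYTES],
    const uint8_t rnd[SKETCH_UUID_BYTES])
{
	assert(out != NULL && rnd != NULL);
	memcpy(out, rnd, SKETCH_UUID_BYTES);
	out[6] = (uint8_t)((out[6] & 0x0f) | 0x40);	/* version 4 */
	out[8] = (uint8_t)((out[8] & 0x3f) | 0x80);	/* RFC 4122 variant */
}

/* Format as 8-4-4-4-12 lowercase hex; returns the snprintf() result. */
int
sketch_uuid_format(char buf[SKETCH_UUID_STRLEN],
    const uint8_t u[SKETCH_UUID_BYTES])
{
	return snprintf(buf, SKETCH_UUID_STRLEN,
	    "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
	    "%02x%02x%02x%02x%02x%02x",
	    u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
	    u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
}
```

Nothing in the result depends on a MAC address or a timestamp, which is the leakage this change removes.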


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
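
A plausible reading of how such a list is consumed is "try each path in order and stop at the first that works". The sketch below assumes exactly that; the list contents come from the message above, while the callback type, the stub probes and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list from the commit message, NULL-terminated. */
static const char *const sketch_initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical probe: returns 0 if the path could be executed. */
typedef int (*sketch_try_exec_t)(const char *path);

/* Walk the list in order and return the first path the probe accepts. */
const char *
sketch_first_usable_init(sketch_try_exec_t try_exec)
{
	assert(try_exec != NULL);
	for (size_t i = 0; sketch_initpaths[i] != NULL; i++) {
		if (try_exec(sketch_initpaths[i]) == 0)
			return sketch_initpaths[i];
	}
	return NULL;	/* nothing usable; a real kernel would panic here */
}

/* Example probes: one where only /rescue/init exists, one where none do. */
int
sketch_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

int
sketch_none(const char *path)
{
	(void)path;
	return -1;
}
```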


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
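
The register/evaluate idea can be sketched as a small table keyed by secmodel id. Everything below is illustrative: the sketch_ names, the table size, the "org.example" style ids used by callers, and the ENOENT/EINVAL error codes are assumptions, not the secmodel(9) implementation. What it shows is the property described above: asking an absent secmodel yields an error instead of touching state that may not exist.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Evaluator callback: answers the question "what" about "arg" into "ret". */
typedef int (*sketch_eval_t)(const char *what, void *arg, void *ret);

struct sketch_secmodel {
	const char *id;
	sketch_eval_t eval;
};

#define SKETCH_MAX_SECMODELS 4
static struct sketch_secmodel sketch_models[SKETCH_MAX_SECMODELS];
static size_t sketch_nmodels;

/* Register an evaluator under an id. */
int
sketch_register(const char *id, sketch_eval_t eval)
{
	assert(id != NULL && eval != NULL);
	if (sketch_nmodels == SKETCH_MAX_SECMODELS)
		return ENOMEM;
	sketch_models[sketch_nmodels].id = id;
	sketch_models[sketch_nmodels].eval = eval;
	sketch_nmodels++;
	return 0;
}

/* Ask the named secmodel; an absent secmodel yields an error. */
int
sketch_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < sketch_nmodels; i++) {
		if (strcmp(sketch_models[i].id, id) == 0)
			return sketch_models[i].eval(what, arg, ret);
	}
	return ENOENT;	/* assumed error code for "no such secmodel" */
}

/* Example evaluator: "is-root" is true when the uid passed in is 0. */
int
sketch_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```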

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
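
To give a flavour of the statistical tests mentioned above, here is a sketch of the simplest one, the monobit test, over a 20000-bit sample. The bounds used (more than 9725 and fewer than 10275 one bits) are the FIPS 140-2 monobit limits as commonly cited; whether libkern/rngtest.c uses exactly these constants is an assumption, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NBITS	20000
#define SKETCH_NBYTES	(SKETCH_NBITS / 8)

/* Count the 1 bits in a buffer, clearing the lowest set bit each step. */
unsigned
sketch_count_ones(const uint8_t *buf, size_t len)
{
	unsigned ones = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < len; i++) {
		for (unsigned b = buf[i]; b != 0; b &= b - 1)
			ones++;
	}
	return ones;
}

/* Monobit test: the sample passes if the count of ones is near half. */
int
sketch_monobit_ok(const uint8_t buf[SKETCH_NBYTES])
{
	unsigned ones = sketch_count_ones(buf, SKETCH_NBYTES);

	return ones > 9725 && ones < 10275;
}
```

A stuck source that emits all zeros or all ones fails immediately, which is the kind of hardware fault the attach-time check is meant to catch.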

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
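
The "must force" rule from items 2 and 4 can be sketched as a sticky flag checked at load time. The flag and function names below are hypothetical look-alikes, and the numeric values and the EPERM return are assumptions, not the module(9) code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MODFLG_MUST_FORCE	0x01	/* assumed value */
#define SKETCH_LOAD_FORCE		0x01	/* assumed value */

struct sketch_module {
	const char *name;
	unsigned flags;
	bool initialized;
};

/* Set the sticky flag; nothing in this sketch ever clears it. */
void
sketch_require_force(struct sketch_module *m)
{
	assert(m != NULL);
	m->flags |= SKETCH_MODFLG_MUST_FORCE;
}

/* After class initialization: flag every built-in that did not come up. */
void
sketch_builtin_require_force(struct sketch_module *mods, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (!mods[i].initialized)
			sketch_require_force(&mods[i]);
	}
}

/* A flagged module loads only when the caller passes the force flag. */
int
sketch_load(struct sketch_module *m, unsigned loadflags)
{
	if ((m->flags & SKETCH_MODFLG_MUST_FORCE) != 0 &&
	    (loadflags & SKETCH_LOAD_FORCE) == 0)
		return EPERM;
	m->initialized = true;
	return 0;
}
```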


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.
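
The deferral idea can be sketched as a queue of callbacks that is drained once the root file system is available. The names and the fixed queue size are hypothetical, and the sketch runs the callbacks inline rather than in the threads a kernel might use.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEFERRED 8

struct sketch_deferred {
	void (*fn)(void *);
	void *arg;
};

static struct sketch_deferred sketch_queue[SKETCH_MAX_DEFERRED];
static size_t sketch_ndeferred;

/* Called during attach: remember work that needs the root file system. */
int
sketch_defer_until_mountroot(void (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	if (sketch_ndeferred == SKETCH_MAX_DEFERRED)
		return -1;
	sketch_queue[sketch_ndeferred].fn = fn;
	sketch_queue[sketch_ndeferred].arg = arg;
	sketch_ndeferred++;
	return 0;
}

/* Called once after the root file system is mounted: drain the queue. */
void
sketch_run_mountroot_hooks(void)
{
	for (size_t i = 0; i < sketch_ndeferred; i++)
		sketch_queue[i].fn(sketch_queue[i].arg);
	sketch_ndeferred = 0;
}

/* Example callback: pretend to finish attaching once firmware is readable. */
void
sketch_finish_attach(void *arg)
{
	*(int *)arg = 1;
}
```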


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.
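
An "always present op vector" can be sketched as a pointer to a table of function pointers that starts out aimed at no-op stubs and is re-pointed when the real implementation attaches. The names below are hypothetical and much simpler than the bpf interface; the point is that callers need no #if and no NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* The op vector: one entry is enough for the illustration. */
struct sketch_tap_ops {
	void (*tap)(void *ctx, const void *pkt, size_t len);
};

static size_t sketch_tapped_bytes;

/* Stub used until the real implementation attaches: does nothing. */
static void
sketch_tap_stub(void *ctx, const void *pkt, size_t len)
{
	(void)ctx;
	(void)pkt;
	(void)len;
}

/* "Real" implementation: just counts bytes seen. */
static void
sketch_tap_real(void *ctx, const void *pkt, size_t len)
{
	(void)ctx;
	(void)pkt;
	sketch_tapped_bytes += len;
}

static struct sketch_tap_ops sketch_stub_ops = { sketch_tap_stub };
static struct sketch_tap_ops sketch_real_ops = { sketch_tap_real };

/* Always valid, so callers never test for NULL or use #if. */
static struct sketch_tap_ops *sketch_ops = &sketch_stub_ops;

/* Called when the real implementation becomes available. */
void
sketch_tap_attach(void)
{
	sketch_ops = &sketch_real_ops;
}

/* A client driver calls through the vector unconditionally. */
void
sketch_driver_input(const void *pkt, size_t len)
{
	assert(sketch_ops != NULL && sketch_ops->tap != NULL);
	sketch_ops->tap(NULL, pkt, len);
}

size_t
sketch_tap_total(void)
{
	return sketch_tapped_bytes;
}
```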


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
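
A minimal sketch of the answer handling described above, assuming that a
"valid path" means an absolute path starting with '/'. The enum and the
function name are hypothetical and are not taken from init_main.c; anything
that is neither a special string nor a path falls through to the usage line.

```c
#include <string.h>

/* Hypothetical classification of one line typed at the init path prompt. */
enum init_answer_sketch {
	ANSWER_HALT,	/* "halt" */
	ANSWER_REBOOT,	/* "reboot" */
	ANSWER_DDB,	/* "ddb" */
	ANSWER_PATH,	/* looks like a path to an init program */
	ANSWER_USAGE	/* anything else: print the usage line and ask again */
};

static enum init_answer_sketch
classify_init_answer_sketch(const char *line)
{
	/* The special strings named in the commit message come first. */
	if (strcmp(line, "halt") == 0)
		return ANSWER_HALT;
	if (strcmp(line, "reboot") == 0)
		return ANSWER_REBOOT;
	if (strcmp(line, "ddb") == 0)
		return ANSWER_DDB;

	/* Assumption: only absolute paths are accepted as an init path. */
	if (line[0] == '/')
		return ANSWER_PATH;

	return ANSWER_USAGE;
}
```

A caller would loop on the prompt until the result is something other than
ANSWER_USAGE.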


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
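
The behaviour described above can be sketched in userspace C roughly as
follows. This is an illustration only: it uses a pthread mutex and C11
atomics as stand-ins for the kernel primitives, and every name in it
(once_sketch, run_once_sketch, example_init) is hypothetical. The init
routine runs at most once, its result is remembered, and an error is
returned to every later caller without a retry.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for a once control. */
struct once_sketch {
	atomic_int done;	/* 0 = init not run yet, 1 = init has run */
	int error;		/* result of the single init call */
};

/* One global lock shared by all once controls; no per-object lock. */
static pthread_mutex_t once_sketch_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock taken once the init routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_sketch_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; a failure is recorded, not retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_sketch_lock);
	return o->error;
}

/* Example init routine that counts its calls and reports an error. */
static int init_calls_sketch;

static int
example_init(void)
{
	init_calls_sketch++;
	return 5;	/* arbitrary non-zero error value */
}
```

Holding the global lock while fn() runs keeps the sketch short, but it means
fn() must not itself call run_once_sketch(). A zero-initialized
struct once_sketch is a valid "not yet run" control, so no static lock
initializer is needed in the object itself.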


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
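
The limit described above amounts to simple arithmetic, sketched here under
the stated assumption of 2kB per vnode. The function names and the byte-count
parameters are hypothetical; the kernel code derives its sizes differently.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode and its associated structures. */
#define VNODE_COST_SKETCH	UINT64_C(2048)

/*
 * Largest vnode count that stays within 75% of kernel virtual address
 * space and within 75% of physical memory, whichever is smaller.
 */
static uint64_t
vnode_limit_sketch(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	uint64_t smaller = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Divide before multiplying so large sizes cannot overflow. */
	return (smaller / 4 * 3) / VNODE_COST_SKETCH;
}

/* Nonzero if a requested desiredvnodes value would be accepted. */
static int
desiredvnodes_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	return requested <= vnode_limit_sketch(kva_bytes, physmem_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, for example, KVA is the
tighter bound and the limit works out to 393216 vnodes.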


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
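
The linked-list and reference-counting scheme in the list above can be
sketched as follows. This is a single-threaded illustration with
hypothetical names (ldisc_sketch and the *_sketch functions); it is not the
ttyldisc_*() code, it omits locking, and the EEXIST and ENOENT returns are
assumptions. Only the EBUSY result on detaching an in-use discipline comes
from the commit message.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line discipline entry kept on a singly linked list. */
struct ldisc_sketch {
	const char *name;
	int refcnt;			/* references held by lookups */
	struct ldisc_sketch *next;
};

static struct ldisc_sketch *ldisc_list_sketch;

/* Register a discipline; assumed to fail if the name is already taken. */
static int
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch *p;

	for (p = ldisc_list_sketch; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list_sketch;
	ldisc_list_sketch = d;
	return 0;
}

/* Look up by name; a successful lookup takes a reference. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	struct ldisc_sketch *p;

	for (p = ldisc_list_sketch; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference taken by ldisc_lookup_sketch(). */
static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	d->refcnt--;
}

/* Unregister a discipline; refuse with EBUSY while it is in use. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	struct ldisc_sketch **pp;

	if (d->refcnt > 0)
		return EBUSY;
	for (pp = &ldisc_list_sketch; *pp != NULL; pp = &(*pp)->next)
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	return ENOENT;
}
```

The provider of a discipline owns the structure and calls attach and detach
itself, which mirrors the point that the table is no longer declared in one
central file.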


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
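
To illustrate how such a symlink target could be expanded, here is a small userland sketch. It is not the kernel implementation: the `toy_` names and the table contents are hypothetical, and the rule that a magic string is only recognized when followed by '/' or the end of the string is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One "@name" -> value substitution (hypothetical table entry). */
struct toy_magic {
	const char *name;
	const char *value;
};

/*
 * Copy `in` to `out`, replacing "@name" tokens found in tbl.
 * A token is only recognized when followed by '/' or end of string
 * (an assumption of this sketch). Returns 0, or -1 if out is too small.
 */
int
toy_expand_magic(const char *in, char *out, size_t outlen,
    const struct toy_magic *tbl, size_t ntbl)
{
	size_t o = 0;

	if (outlen == 0)
		return -1;
	while (*in != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*in == '@') {
			for (size_t i = 0; i < ntbl; i++) {
				size_t n = strlen(tbl[i].name);
				if (strncmp(in + 1, tbl[i].name, n) == 0 &&
				    (in[1 + n] == '/' || in[1 + n] == '\0')) {
					rep = tbl[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);
			if (o + n >= outlen)
				return -1;
			memcpy(out + o, rep, n);
			o += n;
			in += skip;
		} else {
			/* Ordinary character: copy it through unchanged. */
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

Requiring a terminator after the name keeps "@machine" from matching the prefix of "@machine_arch", so table order does not matter in this sketch.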


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
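
A minimal sketch of an iteration macro that skips marker entries, in the spirit of the PROCLIST_FOREACH description above. The structure layout, the `is_marker` field, and all `toy_`/`TOY_` names are hypothetical; the real macro operates on the kernel's proclist and flag bits.

```c
#include <assert.h>
#include <stddef.h>

/* Toy list entry; is_marker flags placeholder entries left by an iterator. */
struct toy_proc {
	int pid;
	int is_marker;
	struct toy_proc *next;
};

/* Walk the list, running the loop body only for non-marker entries. */
#define TOY_PROCLIST_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (!(var)->is_marker)

/* Count real processes on the list, ignoring markers. */
int
toy_count_procs(struct toy_proc *head)
{
	struct toy_proc *p;
	int n = 0;

	TOY_PROCLIST_FOREACH(p, head) {
		n++;
	}
	return n;
}
```

Ordinary callers can then iterate as before, while a function in the style of proclist_foreach_call can park a marker in the list to remember its position across a callback.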


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
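
The sizing rule can be worked through as simple arithmetic. The sketch below is an assumption-laden illustration, not the code in init_main.c: the function name, the per-mille parameter, the per-vnode size, and the floor value are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes to cache, given physical memory in bytes, the share
 * of memory to spend in per-mille (10 = 1%, 5 = 0.5%), the size of one
 * vnode in bytes (must be non-zero), and a lower bound.
 */
uint64_t
toy_default_vnodes(uint64_t physmem_bytes, unsigned permille,
    uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t budget = physmem_bytes / 1000 * permille;
	uint64_t n = budget / vnode_size;

	return n < min_vnodes ? min_vnodes : n;
}
```

With a hypothetical 200-byte vnode and 1,000,000,000 bytes of memory, 1% yields 50,000 vnodes and 0.5% yields 25,000; a larger struct vnode (as on 64-bit ports) lowers the count for the same share of memory, which is the concern raised in the entry.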


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
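
The "invoke all finalizers until none of them reports work" rule is a fixed-point loop. A hedged sketch follows; the `toy_` types and functions are hypothetical and only model the control flow, not the kernel's actual hook registration.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns non-zero if it did some work on this pass. */
typedef int (*toy_finalizer_fn)(void *arg);

struct toy_finalizer {
	toy_finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made (always at least one).
 */
unsigned
toy_config_finalize(const struct toy_finalizer *f, size_t n)
{
	unsigned passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++)
			if (f[i].fn(f[i].arg))
				did_work = 1;
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: consumes one pending unit of work per call. */
int
toy_drain_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Re-running every finalizer after any of them made progress matters because one finalizer's work (for example, configuring a RAID set) may expose new devices that another finalizer then has to handle.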


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
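
For illustration only, the one-to-one rows of the mapping above can be written as a translation function. The enum and function names are hypothetical stand-ins (the commit removes the KERN_* codes rather than translating them at run time), KERN_FAILURE is left out because it has no single replacement, and returning EINVAL for an unknown code is a choice made by this sketch.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired KERN_* codes. */
enum toy_kern_code {
	TOY_KERN_SUCCESS,
	TOY_KERN_INVALID_ADDRESS,
	TOY_KERN_PROTECTION_FAILURE,
	TOY_KERN_NO_SPACE,
	TOY_KERN_INVALID_ARGUMENT,
	TOY_KERN_RESOURCE_SHORTAGE
};

/* Translate a retired code to the errno value listed in the table. */
int
toy_kern_to_errno(enum toy_kern_code code)
{
	switch (code) {
	case TOY_KERN_SUCCESS:
		return 0;
	case TOY_KERN_INVALID_ADDRESS:
		return EFAULT;
	case TOY_KERN_PROTECTION_FAILURE:
		return EACCES;
	case TOY_KERN_NO_SPACE:
	case TOY_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case TOY_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return EINVAL;	/* unknown code: sketch-only fallback */
}
```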


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
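
What a "minimum power-of-two alignment" means for a candidate address can be shown with the usual round-up idiom. This is a generic sketch, not uvm_map() itself: the function names are hypothetical, and treating an alignment of 0 as "no constraint" is an assumption of the example.

```c
#include <assert.h>
#include <stdint.h>

/* True if x is a non-zero power of two. */
int
toy_is_pow2(uintptr_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align.
 * align must be 0 (no constraint, by this sketch's convention)
 * or a power of two.
 */
uintptr_t
toy_align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	assert(toy_is_pow2(align));
	return (addr + align - 1) & ~(align - 1);
}
```

Restricting the alignment to powers of two is what lets the rounding be done with a mask instead of a division.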


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
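
The two properties listed above (client-supplied handle storage and constant-time insertion and removal) can be illustrated with a small hashed timing wheel. This is a userland sketch, not the kernel's callout code; every ex_* name and the wheel size are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define EX_WHEEL_SIZE 8		/* hypothetical; must be a power of two */

/* The client embeds this handle in its own structure; nothing is allocated. */
struct ex_callout {
	struct ex_callout *next;
	struct ex_callout **prevp;	/* address of the pointer that points at us */
	void (*func)(void *);
	void *arg;
	unsigned long when;		/* absolute tick at which to fire */
};

static struct ex_callout *ex_wheel[EX_WHEEL_SIZE];

/* Constant time: link at the head of the bucket for the expiry tick. */
static void
ex_callout_schedule(struct ex_callout *c, unsigned long when,
    void (*func)(void *), void *arg)
{
	struct ex_callout **head = &ex_wheel[when & (EX_WHEEL_SIZE - 1)];

	assert(c->prevp == NULL);	/* must not already be pending */
	c->func = func;
	c->arg = arg;
	c->when = when;
	c->next = *head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = head;
	*head = c;
}

/* Constant time: unlink without walking the bucket. */
static void
ex_callout_stop(struct ex_callout *c)
{
	if (c->prevp == NULL)
		return;			/* not pending */
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* Fire every callout due at tick `now`; rescan after each callback. */
static void
ex_callout_tick(unsigned long now)
{
	struct ex_callout *c;

	for (;;) {
		for (c = ex_wheel[now & (EX_WHEEL_SIZE - 1)]; c != NULL;
		    c = c->next)
			if (c->when == now)
				break;
		if (c == NULL)
			return;
		ex_callout_stop(c);
		c->func(c->arg);
	}
}

/* Example callback: bump the counter passed as the argument. */
static void
ex_bump(void *arg)
{
	++*(int *)arg;
}
```

Because the handle carries a back pointer, stopping a callout never needs a list walk, and because the caller owns the storage, scheduling cannot fail for lack of memory.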


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in a timeout handler. This fixes a
long-standing bug where the accounting file couldn't be put on an NFS file
system (so, e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to another directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
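
A minimal sketch of the convention described above, assuming a hypothetical helper ex_child_sp: the caller always passes the low address and size, and only the machine-dependent side knows which end of the region the stack pointer starts at.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the machine-dependent step.  A zero `base`
 * stands in for a NULL stack: the child keeps the parent's stack pointer.
 */
static uintptr_t
ex_child_sp(uintptr_t parent_sp, uintptr_t base, size_t size,
    int stack_grows_down)
{
	if (base == 0)
		return parent_sp;	/* traditional fork behavior */
	assert(size > 0);
	/* Downward-growing stacks start at the high end of the region. */
	return stack_grows_down ? base + size : base;
}
```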


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
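
The two rules above amount to a clamp. A minimal sketch, assuming hypothetical names (EX_NOFILE, EX_MAXUPRC, struct ex_rlimit) and made-up default values in place of the kernel's own definitions:

```c
#include <assert.h>

/* Hypothetical stand-ins for the compile-time defaults. */
#define EX_NOFILE	64	/* assumed default soft limit on open files */
#define EX_MAXUPRC	80	/* assumed default soft limit on processes */

struct ex_rlimit {
	long rlim_cur;		/* soft limit */
	long rlim_max;		/* hard limit */
};

static long
ex_min(long a, long b)
{
	return a < b ? a : b;
}

/*
 * The hard limit is the system-wide maximum rather than infinity;
 * the soft limit is the default clamped to that maximum.
 */
static struct ex_rlimit
ex_init_limit(long def, long sysmax)
{
	struct ex_rlimit rl;

	assert(sysmax > 0);
	rl.rlim_max = sysmax;
	rl.rlim_cur = ex_min(def, sysmax);
	return rl;
}
```

With this shape a program that sizes a table from the hard limit gets a finite, usable number, and the soft limit can never exceed the hard one.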


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.
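
The deferral described above is essentially a FIFO of callbacks that is drained once the core threads exist. A userland sketch under that assumption; the ex_* names are hypothetical and this is not the kernel's kthread code:

```c
#include <assert.h>
#include <stdlib.h>

struct ex_kthread_q {
	struct ex_kthread_q *next;
	void (*func)(void *);	/* callback that will create the thread */
	void *arg;
};

static struct ex_kthread_q *ex_q_head;
static struct ex_kthread_q **ex_q_tail = &ex_q_head;

/* Queue a creation callback; returns 0 on success, -1 if out of memory. */
static int
ex_kthread_create_deferred(void (*func)(void *), void *arg)
{
	struct ex_kthread_q *kq;

	assert(func != NULL);
	kq = malloc(sizeof(*kq));
	if (kq == NULL)
		return -1;
	kq->next = NULL;
	kq->func = func;
	kq->arg = arg;
	*ex_q_tail = kq;		/* append to keep FIFO order */
	ex_q_tail = &kq->next;
	return 0;
}

/* Drain the queue in order; returns the number of callbacks run. */
static int
ex_kthread_run_deferred(void)
{
	struct ex_kthread_q *kq;
	int n = 0;

	while ((kq = ex_q_head) != NULL) {
		ex_q_head = kq->next;
		if (ex_q_head == NULL)
			ex_q_tail = &ex_q_head;
		kq->func(kq->arg);
		free(kq);
		n++;
	}
	return n;
}

/* Example callback: record the order in which callbacks ran. */
static int ex_order[4];
static int ex_norder;

static void
ex_record(void *arg)
{
	assert(ex_norder < 4);
	ex_order[ex_norder++] = *(int *)arg;
}
```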


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
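
Note that the copy routines swap source and destination. A minimal userland sketch of the mapping above; the compat_* wrapper names are hypothetical and exist only to make the argument order explicit:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len); regions must not overlap. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len); overlap is allowed. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len); only zero/non-zero is meaningful. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}

/* bzero(p, len) -> memset(p, 0, len). */
static void
compat_bzero(void *p, size_t len)
{
	memset(p, 0, len);
}
```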


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.504 17-May-2019 ozaki-r

Implement an aggressive psref leak detector

It is yet another psref leak detector that can tell where a leak occurs,
while the simpler version that is already committed only reports that a
leak occurred.

Investigating psref leaks is hard because once a leak occurs a percpu list of
psref that tracks references can be corrupted. A reference to a tracking object
is memorized in the list via an intermediate object (struct psref) that is
normally allocated on a stack of a thread. Thus, the intermediate object can be
overwritten on a leak resulting in corruption of the list.

The tracker makes a shadow entry to an intermediate object and stores some hints
into it (currently it's a caller address of psref_acquire). We can detect a
leak by checking the entries on certain points where any references should be
released such as the return point of syscalls and the end of each softint
handler.

The feature is expensive and enabled only if the kernel is built with
PSREF_DEBUG.

Proposed on tech-kern


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
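
The arithmetic behind this change can be sketched with plain timespec values. This is an illustration only, with hypothetical ex_* helpers: boottime is the time of day at the moment it becomes known minus the uptime at that moment, and a message's wall-clock time is boottime plus the uptime stamp logged with it.

```c
#include <assert.h>
#include <time.h>

#define EX_NSEC_PER_SEC 1000000000L

/* Return a - b, normalised so that 0 <= tv_nsec < 1e9. */
static struct timespec
ex_timespec_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		r.tv_sec--;
		r.tv_nsec += EX_NSEC_PER_SEC;
	}
	return r;
}

/* Return a + b, normalised so that 0 <= tv_nsec < 1e9. */
static struct timespec
ex_timespec_add(struct timespec a, struct timespec b)
{
	struct timespec r;

	r.tv_sec = a.tv_sec + b.tv_sec;
	r.tv_nsec = a.tv_nsec + b.tv_nsec;
	if (r.tv_nsec >= EX_NSEC_PER_SEC) {
		r.tv_sec++;
		r.tv_nsec -= EX_NSEC_PER_SEC;
	}
	return r;
}

/* boottime = time of day when it was initialised - uptime at that moment. */
static struct timespec
ex_boottime(struct timespec tod_at_init, struct timespec uptime_at_init)
{
	assert(uptime_at_init.tv_nsec >= 0 &&
	    uptime_at_init.tv_nsec < EX_NSEC_PER_SEC);
	return ex_timespec_sub(tod_at_init, uptime_at_init);
}

/* Wall-clock time of a logged message = boottime + its uptime stamp. */
static struct timespec
ex_message_time(struct timespec boottime, struct timespec stamp)
{
	return ex_timespec_add(boottime, stamp);
}
```

Without the subtraction, every message would appear later than it really was by the delay between starting the clocks and initialising the time of day.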


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false that indicates the
netinet6 component isn't loaded on rump kernels. However the flag is
accidentally always false because the flag is turned on in in6_dom_init that is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)
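
A rough picture of the restriction above, as a sketch only: the struct and function names below are hypothetical and are not the kernhist implementation. Each entry stores a pointer to a literal format string plus four integer-sized arguments, and nothing is copied. That is presumably why string arguments are awkward for a tool that reads the buffer later.

```c
#include <stddef.h>
#include <stdint.h>

#define HIST_NENTRIES 4 /* deliberately tiny so wraparound is easy to see */

/* One history record: a literal format string and exactly four arguments. */
struct hist_entry {
    const char *fmt;   /* pointer only; the string is not copied */
    uintmax_t args[4]; /* integer-sized values; unused slots are 0 */
};

/* Fixed-size ring of records; the oldest entries are overwritten. */
struct hist {
    struct hist_entry e[HIST_NENTRIES];
    size_t next;  /* slot the next record will be written to */
    size_t count; /* number of valid records, at most HIST_NENTRIES */
};

/* Append one record, wrapping around when the ring is full. */
static void hist_log(struct hist *h, const char *fmt,
    uintmax_t a0, uintmax_t a1, uintmax_t a2, uintmax_t a3)
{
    struct hist_entry *e = &h->e[h->next];

    e->fmt = fmt;
    e->args[0] = a0;
    e->args[1] = a1;
    e->args[2] = a2;
    e->args[3] = a3;

    h->next = (h->next + 1) % HIST_NENTRIES;
    if (h->count < HIST_NENTRIES)
        h->count++;
}
```

A caller would log a syscall number rather than its name, for example hist_log(&h, "syscall %ju", 42, 0, 0, 0), which matches the limitation noted in the message.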

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
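
The two rules above are asymmetric, and a small sketch may make that easier to see. The function names and boolean parameters here are hypothetical; the real module obtains the securelevel answer through the "is-securelevel-above" evaluation rather than taking it as an argument.

```c
#include <stdbool.h>

/*
 * curtain: may be enabled at any time, but may not be disabled
 * while securelevel is above 0.
 */
static bool curtain_change_allowed(bool securelevel_above_0,
    bool old_enabled, bool new_enabled)
{
    if (securelevel_above_0 && old_enabled && !new_enabled)
        return false; /* refusing to loosen the policy */
    return true;
}

/*
 * usermount / user_set_cpu_affinity: may be disabled at any time,
 * but may not be enabled while securelevel is above 0.
 */
static bool usermount_change_allowed(bool securelevel_above_0,
    bool old_enabled, bool new_enabled)
{
    if (securelevel_above_0 && !old_enabled && new_enabled)
        return false; /* refusing to loosen the policy */
    return true;
}
```

In both cases the refused transition is the one that would make the system more permissive once securelevel has been raised.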

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.
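
For orientation, the simplest of the FIPS 140-2 statistical tests is the monobit test, sketched below. This is not the code in libkern/rngtest.c. The function names are hypothetical, and the bounds are the commonly cited FIPS 140-2 values for a 20,000-bit sample, which should be checked against the standard before being relied on.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MONOBIT_SAMPLE_BYTES 2500 /* 20,000 bits */

/* Count the set bits in a buffer by clearing the lowest set bit each step. */
static unsigned count_ones(const uint8_t *buf, size_t len)
{
    unsigned ones = 0;

    for (size_t i = 0; i < len; i++) {
        for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
            ones++;
    }
    return ones;
}

/*
 * Monobit test: the number of one bits in the sample must lie strictly
 * between the two bounds (assumed here to be 9725 and 10275).
 */
static bool monobit_pass(const uint8_t buf[MONOBIT_SAMPLE_BYTES])
{
    unsigned ones = count_ones(buf, MONOBIT_SAMPLE_BYTES);

    return ones > 9725 && ones < 10275;
}
```

A stuck source that emits all zeros or all ones fails immediately. A perfectly regular pattern such as 0xAA passes, which is why the full suite also includes other tests.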

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objections. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
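
The behaviour described above (a lockless fast path, one global lock, and no retry after an error) can be sketched roughly as follows. This is an illustration only, not the kernel implementation: the names `Once`, `run_once` and `_once_lock` are hypothetical, and the sketch simply holds the global lock while the init routine runs, which the real code may not do.

```python
import threading

# Hypothetical single global lock shared by every once-control,
# standing in for the "global locks" mentioned in the commit message.
_once_lock = threading.Lock()


class Once:
    """Per-call-site state: whether init has run and what it returned."""

    def __init__(self):
        self.done = False
        self.error = None


def run_once(once, init):
    """Run init() at most once; later calls return the cached result."""
    if once.done:                 # fast path: no lock taken
        return once.error
    with _once_lock:              # slow path: serialize first-time callers
        if not once.done:
            once.error = init()   # record the result, even if it is an error
            once.done = True      # never retried after this point
        return once.error
```

Because the result is cached unconditionally, a failing init routine is reported to every later caller instead of being run again, which matches the "does not retry" note in the message.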


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
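
The limit described above can be illustrated with a small calculation. This is a sketch under stated assumptions, not the kernel code: the function name `max_desiredvnodes` and its parameters are hypothetical, and only the 2 kB per-vnode estimate and the two 75% limits come from the commit message.

```python
VNODE_COST = 2 * 1024  # assumed worst-case bytes per vnode, per the message


def max_desiredvnodes(kva_bytes, physmem_bytes):
    """Largest vnode count whose worst-case footprint stays within
    75% of KVA and 75% of physical memory (whichever is smaller)."""
    limit = min(kva_bytes, physmem_bytes) * 3 // 4
    return limit // VNODE_COST
```

For example, with 1 GiB of KVA and 512 MiB of physical memory, the smaller resource governs the result: 75% of 512 MiB divided by 2 kB.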


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
  {get,}{micro,nano,bin}time()
  get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
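
The reference-counting rule described above (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be sketched roughly as follows. This is a minimal userland illustration, not the kernel code: the ldisc_* names and struct layout are hypothetical, and locking is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a registered line discipline. */
struct ldisc {
    const char *name;
    int refcnt;             /* references held by lookups */
    struct ldisc *next;
};

static struct ldisc *ldisc_list;    /* linked list replacing a fixed array */

/* Register a discipline; refuse duplicates by name. */
int
ldisc_attach(struct ldisc *d)
{
    struct ldisc *p;

    for (p = ldisc_list; p != NULL; p = p->next)
        if (strcmp(p->name, d->name) == 0)
            return EEXIST;
    d->refcnt = 0;
    d->next = ldisc_list;
    ldisc_list = d;
    return 0;
}

/* Look up by name; a successful lookup takes a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
    struct ldisc *p;

    for (p = ldisc_list; p != NULL; p = p->next) {
        if (strcmp(p->name, name) == 0) {
            p->refcnt++;
            return p;
        }
    }
    return NULL;
}

/* Drop a reference taken by ldisc_lookup(). */
void
ldisc_release(struct ldisc *d)
{
    d->refcnt--;
}

/* Unregister; fails with EBUSY while references are outstanding. */
int
ldisc_detach(struct ldisc *d)
{
    struct ldisc **pp;

    if (d->refcnt > 0)
        return EBUSY;
    for (pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next) {
        if (*pp == d) {
            *pp = d->next;
            d->next = NULL;
            return 0;
        }
    }
    return ENOENT;
}
```

In this sketch a caller pairs every successful ldisc_lookup() with an ldisc_release(); only then can ldisc_detach() succeed.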


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
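
The "once control" idea used in this and the neighbouring revisions can be sketched in portable C11 as below. This is only an illustration of the pattern, assuming C11 atomics; it is not the kernel's facility, the names run_once, subsys_init and subsys_open are hypothetical, and the busy-wait is a simplification.

```c
#include <stdatomic.h>

enum { ONCE_NEW, ONCE_RUNNING, ONCE_DONE };

/* A zero-initialized struct once starts in the ONCE_NEW state. */
struct once {
    atomic_int state;
};

/* Run fn exactly once, however many callers race to get here. */
static void
run_once(struct once *o, void (*fn)(void))
{
    int expected = ONCE_NEW;

    if (atomic_compare_exchange_strong(&o->state, &expected, ONCE_RUNNING)) {
        fn();                                   /* we won: initialize */
        atomic_store(&o->state, ONCE_DONE);
        return;
    }
    while (atomic_load(&o->state) != ONCE_DONE)
        ;   /* another caller is initializing; wait (sketch only) */
}

/* Hypothetical subsystem whose setup is deferred to first use. */
static int subsys_init_calls;
static struct once subsys_once;

static void
subsys_init(void)
{
    subsys_init_calls++;
}

/* Called on every open; returns how many times init has run. */
int
subsys_open(void)
{
    run_once(&subsys_once, subsys_init);
    return subsys_init_calls;
}
```

With this shape the explicit init call in main() becomes unnecessary, because the first entry point that needs the subsystem triggers its setup.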


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
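
The expansion step can be illustrated with a small userland sketch. It is not the kernel implementation: the function name magic_expand, the table contents, and the rule that a magic name runs up to the next '/' or the end of the string are all assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

struct magic {
    const char *name;
    const char *value;
};

/* Illustrative values only; a real system would supply its own. */
static const struct magic magic_table[] = {
    { "machine",      "i386" },
    { "machine_arch", "i386" },
    { "ostype",       "NetBSD" },
};

/*
 * Copy src to dst, replacing "@name" with its value.  Assumption: a name
 * runs to the next '/' or the end of the string.  Unknown names are
 * copied unchanged.  Returns 0 on success, -1 if dst is too small.
 */
int
magic_expand(const char *src, char *dst, size_t dstlen)
{
    size_t out = 0;

    if (dstlen == 0)
        return -1;
    while (*src != '\0') {
        const char *chunk = src;    /* default: copy input verbatim */
        size_t len = 1;

        if (*src == '@') {
            size_t namelen = strcspn(src + 1, "/");
            size_t i;

            len = namelen + 1;      /* "@name" if nothing matches */
            for (i = 0; i < sizeof(magic_table) / sizeof(magic_table[0]); i++) {
                const struct magic *m = &magic_table[i];

                if (strlen(m->name) == namelen &&
                    strncmp(src + 1, m->name, namelen) == 0) {
                    chunk = m->value;
                    len = strlen(m->value);
                    break;
                }
            }
            src += namelen + 1;
        } else {
            src++;
        }
        if (out + len >= dstlen)
            return -1;
        memcpy(dst + out, chunk, len);
        out += len;
    }
    dst[out] = '\0';
    return 0;
}
```

Under these assumptions, expanding the symlink target from the example usage on a machine whose MACHINE_ARCH is i386 yields /arch/i386/bin.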


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
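
A marker-skipping iteration macro of the kind described above might look roughly like this. It is a simplified sketch on a hand-rolled singly linked list; the struct entry layout, the is_marker flag, and the ENTRY_FOREACH and sum_pids names are hypothetical, and the real macro operates on the kernel's own list types.

```c
#include <stddef.h>

/* Hypothetical list entry; markers are placeholders, not real processes. */
struct entry {
    int pid;
    int is_marker;
    struct entry *next;
};

/*
 * Iterate over every entry but run the loop body only for non-marker
 * entries.  The trailing "else" attaches the caller's body to the test.
 */
#define ENTRY_FOREACH(var, head)                                    \
    for ((var) = (head); (var) != NULL; (var) = (var)->next)        \
        if ((var)->is_marker)                                       \
            continue;                                               \
        else

/* Example consumer: add up the pids of all real entries. */
int
sum_pids(struct entry *head)
{
    struct entry *e;
    int sum = 0;

    ENTRY_FOREACH(e, head) {
        sum += e->pid;
    }
    return sum;
}
```

The point of the design is that code walking the list never has to know that an iterator elsewhere has parked a marker in it.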


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
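
The iteration rule for finalizers (keep invoking all of them until a full pass reports no work) can be sketched as follows. This is an assumption-laden illustration rather than the kernel interface: the finalizer_fn type, struct finalizer, run_finalizers and countdown_finalizer are hypothetical names.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this invocation. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
    finalizer_fn fn;
    void *arg;
};

/*
 * Invoke every finalizer repeatedly until one complete pass finishes
 * with none of them reporting work.  Returns the number of passes made.
 */
int
run_finalizers(const struct finalizer *f, size_t n)
{
    int passes = 0;
    int did_work;

    do {
        size_t i;

        did_work = 0;
        for (i = 0; i < n; i++)
            if (f[i].fn(f[i].arg) != 0)
                did_work = 1;
        passes++;
    } while (did_work);
    return passes;
}

/* Example finalizer: has *arg units of work left, does one per call. */
int
countdown_finalizer(void *arg)
{
    int *remaining = arg;

    if (*remaining <= 0)
        return 0;
    (*remaining)--;
    return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for example, configuring a RAID set) may expose new devices that another finalizer then has to handle.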


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.
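
To see why the bucket count matters, a back-of-the-envelope calculation of the average hash chain length helps. The helper below is purely illustrative (the name avg_chain_len is hypothetical) and assumes entries are spread uniformly across buckets.

```c
#include <stddef.h>

/*
 * Average number of entries per hash bucket, rounded down.
 * Assumes a uniform hash; returns 0 for an empty table description.
 */
size_t
avg_chain_len(size_t nentries, size_t nbuckets)
{
    return nbuckets == 0 ? 0 : nentries / nbuckets;
}
```

With 200,000 vnodes this gives roughly 97 entries per chain for 2048 buckets versus about 3 for 65536 buckets, so each lookup walks a far shorter list.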


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
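
The one-to-one part of the mapping above can be written as a small translation function. This is only an illustration: the commit removed the old codes rather than translating them at run time, the OLD_ prefixed enum and kern_to_errno are hypothetical names, and the enum's numeric values are arbitrary. KERN_FAILURE and the unused codes are left out because they have no single errno equivalent.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
    OLD_KERN_SUCCESS,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old code to the traditional E* value, or -1 if unknown. */
int
kern_to_errno(enum old_kern_code code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:
        return 0;
    case OLD_KERN_INVALID_ADDRESS:
        return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE:
        return EACCES;
    case OLD_KERN_NO_SPACE:
    case OLD_KERN_RESOURCE_SHORTAGE:
        return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:
        return EINVAL;
    }
    return -1;
}
```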


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
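
A rough sketch of the sizing rule described above, assuming the budget is computed in bytes and divided by a per-vnode size. The function name and all three parameters are hypothetical; the log only states the 0.5% figure and the NVNODE floor.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of vnodes that fit in 0.5% of memory,
 * never dropping below the configured minimum.
 */
static uint64_t
vnode_cache_size(uint64_t mem_bytes, size_t vnode_size, uint64_t nvnode_min)
{
	uint64_t n;

	if (vnode_size == 0)
		return nvnode_min;
	n = (mem_bytes / 200) / vnode_size;	/* 0.5% == 1/200 */
	return n > nvnode_min ? n : nvnode_min;
}
```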


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info have been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
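
The two rules above amount to a clamp on the soft limit and a finite hard limit. A minimal sketch, assuming a stand-in struct and helper names that are not from the kernel source:

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rlimit; field names mirror getrlimit(2). */
struct limit_pair {
	uint64_t rlim_cur;	/* soft limit */
	uint64_t rlim_max;	/* hard limit */
};

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * Soft limit: the compile-time default, clamped to the system-wide maximum.
 * Hard limit: the system-wide maximum itself rather than "infinity".
 */
static struct limit_pair
init_limit(uint64_t compiled_default, uint64_t system_max)
{
	struct limit_pair lp;

	lp.rlim_cur = min_u64(compiled_default, system_max);
	lp.rlim_max = system_max;
	return lp;
}
```

With this shape, init_limit(NOFILE, maxfiles) and init_limit(MAXUPRC, maxproc) would cover the two limits named in the entry, and the soft limit can never exceed the hard one.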


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
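
The conversions above are mechanical, but the bcopy() family takes the source first while memcpy()/memmove() take the destination first, which is easy to get wrong. A small sketch of each rewrite, using a hypothetical struct rec and helper names chosen only for illustration:

```c
#include <string.h>

/* Hypothetical record type used only for this illustration. */
struct rec {
	char name[16];
	int  count;
};

/* Was: bcopy(src, dst, sizeof(*dst)); source and destination swap places. */
static void
rec_copy(struct rec *dst, const struct rec *src)
{
	memcpy(dst, src, sizeof(*dst));
}

/* Was: ovbcopy(buf + 1, buf, len - 1); the regions overlap, so memmove(). */
static void
shift_left(char *buf, size_t len)
{
	if (len > 1)
		memmove(buf, buf + 1, len - 1);
}

/* Was: bcmp(a, b, sizeof(*a)) == 0; argument order is unchanged. */
static int
rec_equal(const struct rec *a, const struct rec *b)
{
	return memcmp(a, b, sizeof(*a)) == 0;
}

/* Was: bzero(r, sizeof(*r)); memset() takes the fill byte explicitly. */
static void
rec_clear(struct rec *r)
{
	memset(r, 0, sizeof(*r));
}
```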


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for a process to not double-map its user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


Revision tags: isaki-audio2-base
# 1.503 20-Feb-2019 hannken

Attach "mnt_transinfo" to "dead_rootmount" so every mount has a
valid "mnt_transinfo" and remove now unneeded flag IMNT_HAS_TRANS.

Run fstrans_start()/fstrans_done() on dead_rootmount if FSTRANS_DEAD_ENABLED.
Should become the default for DIAGNOSTIC in the future.


Revision tags: pgoyette-compat-20190127
# 1.502 23-Jan-2019 kamil

Change the place of initproc initialization

The initproc variable cannot be initialized in start_init as there
is a race between vfs_mountroot and start_init.

PR kern/53817 by Andreas Gustafsson


Revision tags: pgoyette-compat-20190118
# 1.501 26-Dec-2018 thorpej

Rather than performing lazy initialization, statically initialize early
in the respective kernel startup routines.


Revision tags: pgoyette-compat-1226 pgoyette-compat-1126
# 1.500 30-Oct-2018 kre

Correct the 6 second offset issue between the time reported by
dmesg -T and the actual time a message was produced, noted on
current-users by Geoff Wing (Oct 27, 2018).

The size of the offset would depend upon architecture, and processor,
but was the delay from starting the clocks to initialising the time
of day (after mounting root, in case that is needed).

Change the kernel to set boottime to be the time at which the
clocks were started, rather than the time at which it is init'd
(by subtracting the interval between).

Correct dmesg to properly compute the ToD based upon the
boottime (which is a timespec, not a timeval, and has been
since Jan 2009) and the time logged in the message.

Note that this can (rarely) be 1 second earlier than date reports.
This occurs when the time when the message was logged was actually
in the next second, but the timecounters have not yet processed
the tick, and so the time of the last tick, near the end of the
previous second, is reported instead. Since times are always
truncated, rather than rounded, it is occasionally possible to
observe that disparity (if you try hard enough).

IOW: sys/kern/subr_prf.c:addtstamp() uses getnanouptime() rather
than nanouptime().

Note in dmesg(8) that -T conversions are gibberish other than
when the message comes from the currently running kernel. (It
could be fixed when -M is used, for messages generated by the
kernel whose corpse is being observed. But hasn't been...)
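
The boottime adjustment described above amounts to subtracting the elapsed
uptime from the current time of day. A minimal userland sketch, assuming
hypothetical helper names (ts_sub, compute_boottime) that are not the
kernel's own:

```c
#include <assert.h>
#include <time.h>

/*
 * Hypothetical helper: return a - b, keeping tv_nsec in [0, 1e9).
 * Both inputs are assumed to be normalised already.
 */
static struct timespec
ts_sub(struct timespec a, struct timespec b)
{
	struct timespec r;

	assert(a.tv_nsec >= 0 && a.tv_nsec < 1000000000L);
	assert(b.tv_nsec >= 0 && b.tv_nsec < 1000000000L);

	r.tv_sec = a.tv_sec - b.tv_sec;
	r.tv_nsec = a.tv_nsec - b.tv_nsec;
	if (r.tv_nsec < 0) {
		/* Borrow one second to bring tv_nsec back into range. */
		r.tv_sec--;
		r.tv_nsec += 1000000000L;
	}
	return r;
}

/*
 * Hypothetical illustration: boottime is the time of day at which the
 * clocks were started, i.e. "now" minus the time elapsed since then.
 */
static struct timespec
compute_boottime(struct timespec now, struct timespec uptime)
{
	return ts_sub(now, uptime);
}
```

Keeping the result as a timespec matches the note above that boottime is a
timespec, not a timeval.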


# 1.499 26-Oct-2018 martin

Only print the "no console" warning when booting verbose or debug.
It is a normal condition in many setups and has no consequences for
the user, so do not scare them.


Revision tags: pgoyette-compat-1020 pgoyette-compat-0930 pgoyette-compat-0906 pgoyette-compat-0728
# 1.498 03-Jul-2018 ozaki-r

Fix missing net.inet6.ip6.ifq node

The node (and child nodes) is initialized in sysctl_net_pktq_setup, but the call
of sysctl_net_pktq_setup is skipped unexpectedly.

sysctl_net_pktq_setup is skipped if in6_present is false, which indicates that
the netinet6 component isn't loaded on rump kernels. However, the flag is
accidentally always false because it is turned on in in6_dom_init, which is
called after if_sysctl_setup on both normal and rump kernels.

Fix the issue by moving if_sysctl_setup after in6_dom_init (domaininit on normal
kernels). This fix is ad-hoc but good enough for netbsd-8. We should refine
the initialization order of network components in the future.

Pointed out by hikaru@


Revision tags: phil-wifi-base pgoyette-compat-0625 pgoyette-compat-0521 pgoyette-compat-0502 pgoyette-compat-0422
# 1.497 16-Apr-2018 kamil

Remove the rnewprocp argument from fork1(9)

It's now unused and it can cause use-after-free scenarios as noted by
<Mateusz Guzik>.

Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


# 1.496 16-Apr-2018 kamil

Set initproc inside start_init()

This allows us to stop using the rnewprocp argument in fork1(9).

The rnewprocp argument will be removed soon from the API, as it can cause
use-after-free scenarios.

No functional change intended.

Noted by <Mateusz Guzik>
Reference: http://mail-index.netbsd.org/tech-kern/2017/09/08/msg022267.html

Sponsored by <The NetBSD Foundation>


Revision tags: pgoyette-compat-0415 pgoyette-compat-0407 pgoyette-compat-0330 pgoyette-compat-0322 pgoyette-compat-0315 pgoyette-compat-base
# 1.495 04-Feb-2018 maxv

branches: 1.495.2;
Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: netbsd-8-0-RC2 netbsd-8-0-RC1 matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4; 1.490.6;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
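
A list like this is typically consumed by trying each entry in order until
one works. The following is a minimal userland sketch of that pattern, not
the kernel's start_init() code; the function first_usable_init and the
predicate type try_exec_fn are hypothetical names introduced for
illustration:

```c
#include <stddef.h>
#include <string.h>

/* The NULL-terminated list quoted in the commit message above. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical predicate: returns nonzero if the path could be started. */
typedef int (*try_exec_fn)(const char *path);

/*
 * Walk the list in order and return the first path the predicate
 * accepts, or NULL if every candidate is rejected.
 */
static const char *
first_usable_init(const char *const *paths, try_exec_fn try_exec)
{
	for (; *paths != NULL; paths++) {
		if (try_exec(*paths))
			return *paths;
	}
	return NULL;
}

/* Demo predicate: pretend only the rescue copy is present. */
static int
accept_rescue_only(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}

/* Demo predicate: pretend nothing can be started. */
static int
reject_all(const char *path)
{
	(void)path;
	return 0;
}
```

Because the new entry sits at the end of the list, it only matters when
every earlier candidate has been rejected.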


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.
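
The register-then-evaluate pattern described above can be sketched in
userland as a small lookup table. This is only an illustration under stated
assumptions, not the kernel implementation: the names model_register,
model_eval and toy_suser_eval, the fixed-size table, and the specific errno
values returned are all hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODELS 8

/* Hypothetical callback type: answers the question `what` into *ret. */
typedef int (*eval_fn)(const char *what, void *arg, void *ret);

struct model {
	const char *id;
	eval_fn eval;
};

static struct model models[MAX_MODELS];
static size_t nmodels;

/* Record a model and its evaluation callback under the given id. */
static int
model_register(const char *id, eval_fn eval)
{
	if (nmodels == MAX_MODELS)
		return ENOMEM;
	models[nmodels].id = id;
	models[nmodels].eval = eval;
	nmodels++;
	return 0;
}

/* Forward the request to the named model, or fail if it is absent. */
static int
model_eval(const char *id, const char *what, void *arg, void *ret)
{
	for (size_t i = 0; i < nmodels; i++) {
		if (strcmp(models[i].id, id) == 0)
			return models[i].eval(what, arg, ret);
	}
	return ENOENT;	/* assumption: absent model reported as ENOENT */
}

/* Toy model: arg points to a uid; uid 0 is treated as root. */
static int
toy_suser_eval(const char *what, void *arg, void *ret)
{
	if (strcmp(what, "is-root") != 0)
		return EINVAL;
	*(bool *)ret = (*(const unsigned *)arg == 0);
	return 0;
}
```

The caller never touches the other model's state directly; it only sees an
error code and, on success, the answer written through the ret pointer.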

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer the callout_startup2() call (which calls softintr_establish(9)) until
after cpu_configure(9) for now, because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
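
The reference-counting rule described above (a lookup holds a reference,
ttyldisc_release() drops it, and detaching an in-use discipline yields
EBUSY) can be illustrated with a minimal userland sketch. The struct and
the demo_ldisc_* names below are hypothetical stand-ins, not the kernel's
linesw structure or ttyldisc_*() functions, and no locking is shown.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified line-discipline record (not struct linesw). */
struct demo_ldisc {
	const char *name;	/* discipline name, e.g. "termios" */
	int attached;		/* nonzero while registered */
	int refcnt;		/* references handed out by lookups */
};

/* Register the discipline; EEXIST if it is already attached. */
static int
demo_ldisc_attach(struct demo_ldisc *d)
{
	assert(d != NULL);
	if (d->attached)
		return EEXIST;
	d->attached = 1;
	d->refcnt = 0;
	return 0;
}

/* A successful lookup takes a reference that the caller must release. */
static struct demo_ldisc *
demo_ldisc_lookup(struct demo_ldisc *d)
{
	assert(d != NULL);
	if (!d->attached)
		return NULL;
	d->refcnt++;
	return d;
}

/* Drop a reference obtained from demo_ldisc_lookup(). */
static void
demo_ldisc_release(struct demo_ldisc *d)
{
	assert(d != NULL && d->refcnt > 0);
	d->refcnt--;
}

/* Detach only when nobody holds a reference; otherwise EBUSY. */
static int
demo_ldisc_detach(struct demo_ldisc *d)
{
	assert(d != NULL);
	if (d->refcnt > 0)
		return EBUSY;
	d->attached = 0;
	return 0;
}
```

In this sketch a detach attempted between a lookup and the matching
release fails with EBUSY, and succeeds once the reference is dropped.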


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
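
The "once control" idea used here and in the neighbouring revisions can be
sketched as follows: the initializer runs on first use instead of being
called from main(). This is a single-threaded userland illustration only;
struct once_ctl, run_once() and the demo_* functions are hypothetical
names, and the sketch omits the locking a real kernel once control needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical once-control record; not the kernel's own type. */
struct once_ctl {
	int done;	/* nonzero after the initializer has run */
	int status;	/* value returned by the initializer */
};

static int lkm_init_calls;	/* counts how often the initializer ran */

/* Stand-in for a subsystem initializer such as lkm_init(). */
static int
demo_lkm_init(void)
{
	lkm_init_calls++;
	return 0;
}

/* Run init at most once; later calls return the cached status. */
static int
run_once(struct once_ctl *ctl, int (*init)(void))
{
	assert(ctl != NULL && init != NULL);
	if (!ctl->done) {
		ctl->status = init();
		ctl->done = 1;
	}
	return ctl->status;
}

/* Stand-in for the "first open" entry point of the subsystem. */
static int
demo_lkm_open(void)
{
	static struct once_ctl ctl;	/* zero-initialized: not yet run */

	return run_once(&ctl, demo_lkm_init);
}
```

However many times demo_lkm_open() is called, demo_lkm_init() runs once,
which is why the explicit call from main() becomes unnecessary.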


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
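
As a rough illustration of the expansion, here is a userland sketch that
substitutes magic strings while copying a link target. The table values
("macppc", "powerpc") are sample data, the function names are
hypothetical, and the tokenizing rule (longest run of lowercase letters
and underscores after '@') is an assumption of this sketch, not
necessarily how the kernel matches names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct magic {
	const char *name;	/* magic string without the leading '@' */
	const char *value;	/* replacement text */
};

/* Sample values only; a running system supplies its own. */
static const struct magic magic_table[] = {
	{ "machine",      "macppc"  },
	{ "machine_arch", "powerpc" },
	{ "ostype",       "NetBSD"  },
};

/* Return the value for an exact name match, or NULL if unknown. */
static const char *
magic_lookup(const char *name, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(magic_table) / sizeof(magic_table[0]); i++) {
		if (strlen(magic_table[i].name) == len &&
		    memcmp(magic_table[i].name, name, len) == 0)
			return magic_table[i].value;
	}
	return NULL;
}

/* Expand "in" into "out"; 0 on success, -1 if "out" is too small. */
static int
expand_magic(const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	assert(in != NULL && out != NULL);
	if (outlen == 0)
		return -1;
	while (*in != '\0') {
		const char *val = NULL;
		size_t len = 0;

		if (*in == '@') {
			/* Token: longest run of [a-z_] following the '@'. */
			while ((in[1 + len] >= 'a' && in[1 + len] <= 'z') ||
			    in[1 + len] == '_')
				len++;
			val = magic_lookup(in + 1, len);
		}
		if (val != NULL) {
			size_t vlen = strlen(val);

			if (o + vlen >= outlen)
				return -1;
			memcpy(out + o, val, vlen);
			o += vlen;
			in += 1 + len;
		} else {
			/* Ordinary character or unknown name: copy as-is. */
			if (o + 1 >= outlen)
				return -1;
			out[o++] = *in++;
		}
	}
	out[o] = '\0';
	return 0;
}
```

Matching the whole token rather than a prefix keeps @machine from
capturing the front of @machine_arch; unknown names are left untouched.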


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
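
For a feel of the numbers, the sizing rule can be sketched as a small
calculation. The function name, the per-mille parameter and the 128-byte
per-vnode cost used in the note below are assumptions for illustration;
the real struct vnode size differs by port, as the message points out.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in a given share of physical memory.
 * The share is expressed in per-mille so that 0.5% (5) and 1% (10)
 * can both be written as integers.
 */
static uint64_t
vnode_cache_entries(uint64_t physmem_bytes, unsigned permille,
    uint64_t vnode_bytes)
{
	assert(vnode_bytes > 0);
	return (physmem_bytes / 1000) * permille / vnode_bytes;
}
```

With 2 GiB of memory and an assumed 128 bytes per vnode, 0.5% yields
83886 entries and 1% yields 167772, so the change doubles the default
cache size.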


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
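
The "iterate until nobody did any work" rule for finalizers can be
sketched in userland as below. The struct, the function names and the
demo finalizer are hypothetical; the sketch only shows the loop shape,
not the kernel's list management or device arguments.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn func;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass completes in
 * which none of them reports work.  Returns the number of passes.
 */
static int
run_finalizers(const struct finalizer *list, size_t n)
{
	int passes = 0;
	int worked;
	size_t i;

	assert(list != NULL || n == 0);
	do {
		worked = 0;
		for (i = 0; i < n; i++)
			worked |= (list[i].func(list[i].arg) != 0);
		passes++;
	} while (worked);
	return passes;
}

/* Demo finalizer: configures one pending unit per call. */
static int
demo_configure_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating whole passes matters because one finalizer's work (for example
configuring a RAID set) may create devices that give another finalizer
something new to do.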


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
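
The mapping above can be written down as a small lookup, shown here only as
a sketch. The enum values and the function name kern_to_errno() are
assumptions for illustration; the commit removed the KERN_* codes rather
than adding a translation function. Codes with no single errno equivalent
(KERN_FAILURE and the unused ones) return -1 in this sketch.

```c
#include <errno.h>

/* Illustrative numbering only; not taken from the original headers. */
enum kern_code {
	KERN_SUCCESS = 0,
	KERN_INVALID_ADDRESS,
	KERN_PROTECTION_FAILURE,
	KERN_NO_SPACE,
	KERN_INVALID_ARGUMENT,
	KERN_FAILURE,
	KERN_RESOURCE_SHORTAGE,
	KERN_NOT_RECEIVER,
	KERN_NO_ACCESS,
	KERN_PAGES_LOCKED
};

/* Translate an old KERN_* code to the E* code listed in the table. */
int
kern_to_errno(enum kern_code kc)
{
	switch (kc) {
	case KERN_SUCCESS:
		return 0;
	case KERN_INVALID_ADDRESS:
		return EFAULT;
	case KERN_PROTECTION_FAILURE:
		return EACCES;
	case KERN_NO_SPACE:
	case KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		/* KERN_FAILURE and the unused codes have no fixed mapping. */
		return -1;
	}
}
```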


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
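
The kind of overflow a wider type avoids can be shown with plain integer
arithmetic. This sketch assumes the quantity is a page count multiplied by a
page size, and uses uint64_t as a stand-in for a wide type such as rlim_t;
the actual expression in init_main.c is not shown in this log, and both
function names are hypothetical.

```c
#include <stdint.h>

/* Widen before multiplying, so the product cannot wrap at 32 bits. */
uint64_t
avail_bytes_wide(uint32_t npages, uint32_t page_size)
{
	return (uint64_t)npages * page_size;
}

/* Narrow version: the product is reduced modulo 2^32. */
uint32_t
avail_bytes_narrow(uint32_t npages, uint32_t page_size)
{
	return npages * page_size;
}
```

With 2^20 pages of 4096 bytes each, the wide version yields 2^32 bytes while
the narrow one wraps to zero.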


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
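
For readers unfamiliar with power-of-two alignment, the arithmetic involved
looks roughly like the sketch below. align_up() is a hypothetical helper,
not part of uvm_map(); it only shows how an address is rounded up to a
minimum power-of-two alignment.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr up to the next multiple of align.  The mask trick is
 * only valid when align is a nonzero power of two.
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~(align - 1);
}
```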


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
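
The sizing rule reads as "0.5% of memory, but never fewer than NVNODE". A
minimal sketch of that calculation follows, assuming the memory budget is
divided by a per-vnode size; the function name and parameters are
hypothetical, and the kernel's actual formula may account for more than
this.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in 0.5% (1/200) of mem_bytes, with a
 * floor of min_vnodes.
 */
uint64_t
autosize_vnodes(uint64_t mem_bytes, uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t n;

	assert(vnode_size != 0);
	n = (mem_bytes / 200) / vnode_size;
	return n < min_vnodes ? min_vnodes : n;
}
```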


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
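
Since the stack is given as a low address plus a size, machine-dependent
code has to pick the starting stack pointer according to the direction of
growth. The helper below is a hypothetical illustration of that choice and
is not taken from any port's cpu_fork().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Pick the initial stack pointer for a stack area described by its
 * lowest address and size.  A downward-growing stack starts at the
 * top of the area, an upward-growing one at the bottom.
 */
uintptr_t
initial_stack_pointer(uintptr_t base, size_t size, int grows_down)
{
	assert(size != 0);
	return grows_down ? base + size : base;
}
```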


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is all the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].
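
Stripping the leading path from a name such as /sbin/init can be done along
these lines. strip_path() is a hypothetical name, and this userland sketch
is not the code from start_init().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the component after the last '/', or the whole string if none. */
const char *
strip_path(const char *path)
{
	const char *slash;

	assert(path != NULL);
	slash = strrchr(path, '/');
	return slash != NULL ? slash + 1 : path;
}
```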


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
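
The two rules above amount to "hard limit is the system-wide maximum, soft
limit is the default clamped to it". A sketch, with init_limit() and its
parameter names as illustrative assumptions rather than the kernel's code:

```c
#include <assert.h>
#include <stddef.h>
#include <sys/resource.h>

/*
 * Initialize one resource limit: the hard limit is the system-wide
 * cap (e.g. maxfiles), the soft limit is the compile-time default
 * (e.g. NOFILE) clamped so it never exceeds the cap.
 */
void
init_limit(struct rlimit *rl, rlim_t soft_default, rlim_t hard_cap)
{
	assert(rl != NULL);
	rl->rlim_max = hard_cap;
	rl->rlim_cur = soft_default < hard_cap ? soft_default : hard_cap;
}
```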


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
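
Note that the source and destination arguments swap places between bcopy()
and memcpy()/memmove(). The wrappers below, whose names are made up for this
sketch, show the standard C calls that the conversions above end up using.

```c
#include <stddef.h>
#include <string.h>

/* Was bcopy(src, dst, len): memcpy() takes the destination first. */
void
copy_block(void *dst, const void *src, size_t len)
{
	memcpy(dst, src, len);
}

/* Was ovbcopy(src, dst, len): memmove() is safe for overlapping areas. */
void
move_block(void *dst, const void *src, size_t len)
{
	memmove(dst, src, len);
}

/* Was bcmp(a, b, len) == 0: true when the two areas are identical. */
int
same_block(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) == 0;
}

/* Was bzero(p, len). */
void
zero_block(void *p, size_t len)
{
	memset(p, 0, len);
}
```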


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.495 04-Feb-2018 maxv

Add a proper defflag for GPROF, and include opt_gprof.h, otherwise we're
not gonna go very far.


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most code in in.c is imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()
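
The idea behind a run-once primitive can be sketched in portable C. This
is a simplified, single-threaded stand-in and not the kernel's
implementation: the names sketch_once_t, sketch_run_once and example_init
are hypothetical, and a real implementation also has to serialize
concurrent callers.

```c
#include <stdbool.h>

/* Hypothetical control block: records whether the initializer has run
 * and what it returned. */
typedef struct {
	bool done;
	int  error;
} sketch_once_t;

/* Call init() on the first invocation only; later invocations return the
 * saved result. Assumption: no concurrent callers (no locking here). */
int
sketch_run_once(sketch_once_t *o, int (*init)(void))
{
	if (!o->done) {
		o->error = init();
		o->done = true;
	}
	return o->error;
}

/* Example initializer that counts how often it is actually invoked. */
int example_init_calls;

int
example_init(void)
{
	example_init_calls++;
	return 0;
}
```

Calling sketch_run_once() repeatedly with the same control block invokes
example_init() a single time and hands every caller the same return value.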


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
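
A minimal sketch of how such a NULL-terminated fallback list could be
walked, assuming the entries are tried in order until one can be
executed. The function names and the try_exec callback are hypothetical
and are not taken from init_main.c; only the path list comes from the
commit message above.

```c
#include <stddef.h>
#include <string.h>

/* Path list as given in the commit message. */
const char *const sketch_initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Return the first path for which try_exec() reports success (0),
 * or NULL if every candidate fails. */
const char *
sketch_pick_init(const char *const *paths, int (*try_exec)(const char *))
{
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (try_exec(paths[i]) == 0)
			return paths[i];
	}
	return NULL;
}

/* Hypothetical stand-in for an exec attempt: pretends that only
 * /rescue/init exists. */
int
sketch_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}
```

With sketch_only_rescue as the callback, the first three candidates fail
and the walk falls through to the last entry, which is the situation the
added "/rescue/init" path is meant to cover.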


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops kern.maxvnodes from being set so high that it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

bool isroot;
error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
cred, &isroot);
if (error == 0 && !isroot)
result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.
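
The two rules above can be sketched as a pair of predicates. This is an
illustration of the stated policy only; the function names are
hypothetical and the real checks go through secmodel_eval(9) rather than
taking the securelevel as a plain argument.

```c
#include <stdbool.h>

/* curtain: enabling is always allowed; disabling is refused once
 * securelevel is above 0. */
bool
sketch_curtain_change_ok(int securelevel, bool enable)
{
	return enable || securelevel <= 0;
}

/* usermount / user_set_cpu_affinity: disabling is always allowed;
 * enabling is refused once securelevel is above 0. */
bool
sketch_usermount_change_ok(int securelevel, bool enable)
{
	return !enable || securelevel <= 0;
}
```

In both cases the direction that tightens the system stays available at
any securelevel, and only the loosening direction is gated.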

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.
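
The flag behaviour described in items 2 and 4 can be sketched as below.
All names and numeric values here are hypothetical stand-ins (prefixed
sk_ / SK_), not the definitions from the module code; the sketch only
shows a sticky flag that makes later loads require a force option.

```c
#include <stdbool.h>

#define SK_MODFLG_MUST_FORCE 0x01 /* hypothetical value */
#define SK_LOAD_FORCE        0x01 /* hypothetical stand-in for MODCTL_LOAD_FORCE */

/* Hypothetical, reduced module record. */
struct sk_module {
	const char *name;
	int         flags;
};

/* Mark a module so that any later load needs the force option.
 * The flag is only ever set, never cleared. */
void
sk_require_force(struct sk_module *m)
{
	m->flags |= SK_MODFLG_MUST_FORCE;
}

/* A load is refused only when the module is marked and the caller
 * did not pass the force option. */
bool
sk_may_load(const struct sk_module *m, int load_flags)
{
	if ((m->flags & SK_MODFLG_MUST_FORCE) != 0 &&
	    (load_flags & SK_LOAD_FORCE) == 0)
		return false;
	return true;
}
```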

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
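
The special strings accepted at the init path prompt can be sketched as a small classifier. This is an illustration only, not the code in init_main.c: the enum, the function name, and the rule that a usable path starts with '/' are all assumptions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical result codes for one line typed at the init path prompt. */
enum init_reply_sketch {
	REPLY_HALT,
	REPLY_REBOOT,
	REPLY_DDB,
	REPLY_PATH,
	REPLY_INVALID
};

/* Classify the reply: special command, candidate path, or invalid. */
enum init_reply_sketch
classify_init_reply_sketch(const char *s)
{
	if (strcmp(s, "halt") == 0)
		return REPLY_HALT;
	if (strcmp(s, "reboot") == 0)
		return REPLY_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return REPLY_DDB;
	if (s[0] == '/')		/* assumption: init paths are absolute */
		return REPLY_PATH;
	return REPLY_INVALID;		/* caller would ask again */
}
```

An invalid reply is the case where revision 1.371 prints a simple usage line before prompting again.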


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
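
The behaviour described above (lockless fast path, one global lock, no retry after an error) can be sketched in userland with C11 atomics and a pthread mutex. This is not the kernel implementation; the structure, the function names, and the choice to hold the lock while the routine runs are assumptions made to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical control block for one run-once routine. */
struct once_sketch {
	atomic_int done;	/* 0 = not run yet, 1 = ran, result cached */
	int error;		/* result of the single run */
};

/* One global lock shared by every control block. */
static pthread_mutex_t once_global_lock = PTHREAD_MUTEX_INITIALIZER;

int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock taken once the routine has run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_global_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Run exactly once; an error is cached, not retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_global_lock);
	return o->error;
}

/* Hypothetical failing routine used to show the no-retry behaviour. */
int sample_init_calls;

int
sample_init_fails(void)
{
	sample_init_calls++;
	return 5;
}
```

Calling run_once_sketch() twice with sample_init_fails returns the same error both times and leaves sample_init_calls at 1, which mirrors the "does not retry after an error" semantics.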


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
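
The limit above works out as simple arithmetic. The sketch below assumes the cap is taken against the smaller of the two sizes; only the 2kB cost and the 75% figure come from the commit message, and the function names and parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case bytes per vnode, as stated in the commit message. */
#define VNODE_COST_SKETCH 2048u

/*
 * Largest vnode count whose worst-case footprint stays within 75% of
 * both the kernel virtual address space and physical memory.
 */
uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* Divide before multiplying so large sizes cannot overflow. */
	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}

/* Would a requested desiredvnodes value be accepted under this rule? */
int
vnode_request_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= max_vnodes_sketch(kva_bytes, phys_bytes);
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA is the tighter bound: 75% of 1 GiB is 805306368 bytes, which at 2048 bytes each allows 393216 vnodes.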


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker to handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
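
The refcounting rule described in revision 1.259 (a lookup holds a reference, and detaching an in-use discipline fails with EBUSY) can be illustrated with a small userland model. This is only a sketch, not the kernel code: the names ldisc_attach(), ldisc_lookup(), ldisc_release() and ldisc_detach(), the struct layout, and the EEXIST/ENOENT returns are assumptions made for the example, and all locking is omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userland model of a refcounted line-discipline registry. */
struct ldisc {
	const char *name;
	int refcnt;            /* references held by successful lookups */
	struct ldisc *next;    /* a linked list replaces the fixed array */
};

static struct ldisc *ldisc_list;

/* Register a discipline; fails with EEXIST if the name is already taken. */
int
ldisc_attach(struct ldisc *ld)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list;
	ldisc_list = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	for (struct ldisc *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	assert(ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are outstanding. */
int
ldisc_detach(struct ldisc *ld)
{
	if (ld->refcnt > 0)
		return EBUSY;
	for (struct ldisc **pp = &ldisc_list; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	return ENOENT;
}
```

A caller that looked up a discipline must release it before the provider can detach it; until then the detach attempt reports EBUSY rather than leaving a dangling pointer.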


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
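
The "once control" pattern used in this and the neighbouring revisions can be modelled roughly as follows. This is a single-threaded userland sketch under stated assumptions: struct once, run_once(), subsys_init() and subsys_open() are hypothetical names, and the kernel's own once interface additionally handles concurrent callers, which this model does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, single-threaded model of a "once control". */
struct once {
	bool done;   /* has the init function already run? */
	int  error;  /* result of the init function, replayed to later callers */
};

/* Run init_fn on the first call only; later calls return its saved result. */
int
run_once(struct once *o, int (*init_fn)(void))
{
	assert(init_fn != NULL);
	if (!o->done) {
		o->error = init_fn();
		o->done = true;
	}
	return o->error;
}

int subsys_init_calls;   /* counts how often the init really ran */

/* One-time setup for the (hypothetical) subsystem. */
int
subsys_init(void)
{
	subsys_init_calls++;
	return 0;
}

static struct once subsys_once;

/* Entry point, e.g. the first open of a device: initialise lazily. */
int
subsys_open(void)
{
	return run_once(&subsys_once, subsys_init);
}
```

Because the initialisation is triggered from the subsystem's own entry point, main() no longer needs an explicit init call, which is the effect these commits describe.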


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

    @machine        value of MACHINE for the system
    @machine_arch   value of MACHINE_ARCH for the system
    @hostname       the system host name, as set with sethostname()
    @domainname     the system domain name, as set with setdomainname()
    @kernel_ident   the kernel config file name
    @osrelease      the release number of the OS
    @ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

    mkdir /arch/i386/bin
    mkdir /arch/sparc/bin
    ln -s /arch/@machine_arch/bin /bin
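
The expansion step described in revision 1.248 could look roughly like the userland sketch below. It is not the kernel implementation: magic_expand(), the table layout and the sample values ("i386", "NetBSD") are assumptions for illustration, and the rule that a name must be followed by '/' or the end of the string is this sketch's own choice for keeping "@machine" from matching the front of "@machine_arch".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value table; a kernel would take these from system state. */
struct magic {
	const char *name;
	const char *value;
};

static const struct magic magic_tab[] = {
	{ "machine",      "i386" },
	{ "machine_arch", "i386" },
	{ "ostype",       "NetBSD" },
};

/*
 * Copy `in` to `out`, replacing each known "@name" token with its value.
 * A token runs from '@' up to the next '/' or the end of the string and
 * must match a table name exactly; unknown tokens are copied unchanged.
 * Returns 0 on success, -1 if `out` is too small.
 */
int
magic_expand(const char *in, char *out, size_t outlen)
{
	size_t o = 0;

	assert(in != NULL && out != NULL);
	while (*in != '\0') {
		const char *src = in;   /* default: copy one literal byte */
		size_t n = 1;
		size_t skip = 1;

		if (*in == '@') {
			size_t toklen = strcspn(in + 1, "/");
			size_t count = sizeof(magic_tab) / sizeof(magic_tab[0]);

			for (size_t i = 0; i < count; i++) {
				if (strlen(magic_tab[i].name) == toklen &&
				    strncmp(in + 1, magic_tab[i].name, toklen) == 0) {
					src = magic_tab[i].value;
					n = strlen(src);
					skip = toklen + 1;
					break;
				}
			}
		}
		if (o + n >= outlen)    /* keep room for the terminating NUL */
			return -1;
		memcpy(out + o, src, n);
		o += n;
		in += skip;
	}
	if (o >= outlen)
		return -1;
	out[o] = '\0';
	return 0;
}
```

With the sample table, the link target from the example usage, /arch/@machine_arch/bin, expands to /arch/i386/bin.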


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. (For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
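
The sizing arithmetic behind revision 1.232 can be worked through with a small helper. This is a sketch only: vnodes_for() is a hypothetical function, and the vnode sizes used with it are assumed values, since sizeof(struct vnode) differs between ports.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in percent_tenths/1000 of physical memory.
 * percent_tenths = 5 models the old 0.5% default, 10 models the new 1%.
 * vnode_size is an assumed sizeof(struct vnode) in bytes.
 */
uint64_t
vnodes_for(uint64_t physmem_bytes, unsigned percent_tenths, uint64_t vnode_size)
{
	assert(vnode_size > 0);
	return physmem_bytes / 1000 * percent_tenths / vnode_size;
}
```

For a machine with 1,000,000,000 bytes of memory and an assumed 200-byte vnode, the old default yields 25,000 vnodes and the new one 50,000. Doubling the assumed vnode size to 400 bytes brings the 1% figure back down to 25,000, which is the concern the commit raises for 64-bit ports.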


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
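
The iteration rule in revision 1.208 (keep invoking all finalizers until none of them reports work) can be sketched in userland as below. The names finalize_register(), finalize_run() and countdown_finalizer(), the fixed-size table, and the pass counter are assumptions for this example and are not taken from the kernel source.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FINALIZERS 8

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

static struct {
	finalizer_fn fn;
	void *arg;
} finalizers[MAX_FINALIZERS];
static size_t nfinalizers;

/* Register a finalizer; returns 0 on success, -1 if the table is full. */
int
finalize_register(finalizer_fn fn, void *arg)
{
	assert(fn != NULL);
	if (nfinalizers == MAX_FINALIZERS)
		return -1;
	finalizers[nfinalizers].fn = fn;
	finalizers[nfinalizers].arg = arg;
	nfinalizers++;
	return 0;
}

/*
 * Call every finalizer, and repeat the whole pass for as long as at
 * least one of them reports work. Returns the number of passes made.
 */
int
finalize_run(void)
{
	int passes = 0, did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < nfinalizers; i++)
			did_work |= (finalizers[i].fn(finalizers[i].arg) != 0);
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has *arg units of work left and does one per call. */
int
countdown_finalizer(void *arg)
{
	int *remaining = arg;

	if (*remaining == 0)
		return 0;
	(*remaining)--;
	return 1;
}
```

Repeating the full pass matters because work done by one finalizer (for example, configuring a RAID set) may expose new devices that another finalizer then needs to handle.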


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
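
The mapping above can be sketched as a small lookup function. This is only an illustration, not the code from the commit: the OLD_KERN_* enumerators and old_kern_to_errno() are hypothetical names, the enumerator values are arbitrary, and codes without a single errno equivalent (KERN_FAILURE and the unused ones) return -1 here.

```c
#include <errno.h>

/* Hypothetical stand-ins for the retired Mach-style codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to a traditional E* value, per the table above. */
static int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:			return 0;
	case OLD_KERN_INVALID_ADDRESS:		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:	return EACCES;
	case OLD_KERN_NO_SPACE:			return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:		return EINVAL;
	case OLD_KERN_RESOURCE_SHORTAGE:	return ENOMEM;
	default:				return -1; /* no single mapping */
	}
}
```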


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
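
A minimal sketch of the kind of overflow this avoids, assuming a 32-bit page count and page size; the function names are hypothetical and a 64-bit unsigned type stands in for rlim_t:

```c
#include <stdint.h>

/* Narrow arithmetic: the product wraps modulo 2^32 once it exceeds 4 GiB. */
static uint32_t
avail_bytes_narrow(uint32_t pages, uint32_t page_size)
{
	return pages * page_size;
}

/* Wide arithmetic: widen one operand first so the full product is kept. */
static uint64_t
avail_bytes_wide(uint32_t pages, uint32_t page_size)
{
	return (uint64_t)pages * page_size;
}
```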


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
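
The sizing rule can be sketched roughly as follows. The floor and the per-vnode size are hypothetical placeholders rather than the kernel's real values, and the case where the count is set explicitly in the kernel config is not modeled:

```c
#include <stdint.h>

#define SKETCH_NVNODE		1000u	/* hypothetical minimum vnode count */
#define SKETCH_VNODE_BYTES	256u	/* hypothetical memory cost per vnode */

/* Give vnodes 0.5% of memory, but never fewer than the compiled-in minimum. */
static uint64_t
sketch_autosize_vnodes(uint64_t mem_bytes)
{
	uint64_t budget = mem_bytes / 200;		/* 0.5% of memory */
	uint64_t n = budget / SKETCH_VNODE_BYTES;

	return n < SKETCH_NVNODE ? SKETCH_NVNODE : n;
}
```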


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
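
A minimal sketch of the initialization described above, assuming a compiled-in default (NOFILE or MAXUPRC) and a system-wide maximum (maxfiles or maxproc) as inputs. The struct and function names are hypothetical, and a 64-bit unsigned type stands in for rlim_t:

```c
#include <stdint.h>

/* Hypothetical stand-in for struct rlimit. */
struct sketch_rlimit {
	uint64_t cur;	/* soft limit */
	uint64_t max;	/* hard limit */
};

/*
 * The hard limit is the system-wide maximum rather than "infinity";
 * the soft limit is the compiled-in default clamped to that maximum.
 */
static struct sketch_rlimit
sketch_init_limit(uint64_t compiled_default, uint64_t system_max)
{
	struct sketch_rlimit r;

	r.max = system_max;
	r.cur = compiled_default < system_max ? compiled_default : system_max;
	return r;
}
```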


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
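
The main pitfall in this conversion is the argument order: the b* routines take the source first, while memcpy() and memmove() take the destination first. A short sketch using hypothetical compat_* wrappers (not part of the commit) that keep the old argument order:

```c
#include <string.h>

/* bcopy(x, y, z) -> memcpy(y, x, z): source and destination swap places. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(x, y, z) -> memmove(y, x, z): safe for overlapping regions. */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(x, y, z) -> memcmp(x, y, z): zero means the regions are equal. */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(x, y) -> memset(x, 0, y): the fill byte becomes explicit. */
static void
compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```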


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.494 26-Dec-2017 msaitoh

Make cold __read_mostly like mp_online.


# 1.493 15-Dec-2017 chs

add some assertions to verify that CPU_INFO_FOREACH() works right
early in the boot process. this detects existing bugs on some platforms.


Revision tags: tls-maxphys-base-20171202
# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: matt-nb8-mediatek-base nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
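
The size mismatch described in this entry can be illustrated with a small
host-side sketch. The types are hypothetical stand-ins, not the kernel's
definitions: user_ptr32_t models a 32-bit user pointer on an n32-style ABI,
and arg_slot_t models a syscall argument slot padded to a 64-bit register.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: user pointers are 32 bits wide ... */
typedef uint32_t user_ptr32_t;

/* ... while each syscall argument slot is padded to a 64-bit register. */
typedef union {
	uint64_t     pad;
	user_ptr32_t datum;
} arg_slot_t;

#define INIT_ARGC 3	/* argv[] for init holds three pointer entries */

/* Bytes that should be copied out: the argv[] array itself. */
size_t
argv_bytes(void)
{
	return INIT_ARGC * sizeof(user_ptr32_t);
}

/* Bytes a sizeof(args)-style call would copy: an array of argument slots. */
size_t
slot_array_bytes(void)
{
	arg_slot_t args[INIT_ARGC];

	return sizeof(args);
}
```

Under these assumptions the slot array is twice the size of the argv[] array,
so copying sizeof(args) bytes would write past the intended area.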


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
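
As a rough illustration of what "v4 (random)" means, the sketch below stamps
the version and variant bits onto 16 caller-supplied random bytes, following
the RFC 4122 byte layout. The function names are hypothetical and this is not
the kernel's implementation; the source of the random bytes is left to the
caller.

```c
#include <stdint.h>

/*
 * Turn 16 random bytes into a version 4 UUID in place.
 * Only 6 bits are fixed; the remaining 122 bits stay random.
 */
void
uuid_v4_stamp(uint8_t b[16])
{
	b[6] = (uint8_t)((b[6] & 0x0f) | 0x40);	/* version nibble = 4 */
	b[8] = (uint8_t)((b[8] & 0x3f) | 0x80);	/* variant bits = 10 */
}

/* Report the version nibble of a UUID held as a byte string. */
int
uuid_version(const uint8_t b[16])
{
	return b[6] >> 4;
}
```

Because no MAC address or timestamp is mixed in, nothing about the host
leaks through the identifier, which is the point made above.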


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
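
A minimal sketch of how such a fallback list can be consumed, assuming the
entries are tried in order until one succeeds. The list is the one quoted
above; pick_init() and the callbacks are hypothetical names for illustration,
not kernel functions.

```c
#include <stddef.h>
#include <string.h>

/* The list quoted in the log message above. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical callback: returns 0 if `path` could be started. */
typedef int (*try_exec_fn)(const char *path);

/* Return the first path the callback accepts, or NULL if none works. */
const char *
pick_init(try_exec_fn try_exec)
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;
}

/* Example callback: pretend only the rescue copy is usable. */
int
only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0 ? 0 : -1;
}

/* Example callback: pretend nothing is usable. */
int
reject_all(const char *path)
{
	(void)path;
	return -1;
}
```

Placing "/rescue/init" last means it is only reached when the three /sbin
candidates have all failed.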


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.
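
The force-flag rule from items 2 and 4 can be sketched as a one-way latch.
Everything below is illustrative: the sk_ names and the flag values are
assumptions standing in for MODFLG_MUST_FORCE and MODCTL_LOAD_FORCE, and the
real module code tracks far more state.

```c
#include <stdbool.h>

/* Illustrative values only; the kernel's actual definitions may differ. */
#define SK_MODFLG_MUST_FORCE	0x01	/* models MODFLG_MUST_FORCE */
#define SK_LOAD_FORCE		0x01	/* models MODCTL_LOAD_FORCE */

struct sk_module {
	const char *name;
	int         flags;
};

/* One-way latch: once set, the flag is never cleared. */
void
sk_require_force(struct sk_module *m)
{
	m->flags |= SK_MODFLG_MUST_FORCE;
}

/* A latched module may only be (re)loaded when the caller forces it. */
bool
sk_may_load(const struct sk_module *m, int load_flags)
{
	if ((m->flags & SK_MODFLG_MUST_FORCE) == 0)
		return true;
	return (load_flags & SK_LOAD_FORCE) != 0;
}
```

In this model, a built-in module that was explicitly unloaded, or that never
initialized successfully, stays disabled until a forced load is requested.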


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.
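The reply handling described above can be sketched roughly as follows. This is an illustrative userland sketch, not the code from init_main.c: the enum, the function name classify_init_reply(), and the rule that a valid path must start with '/' are all assumptions made for the example.

```c
#include <string.h>

/* Hypothetical classification of a reply typed at the init path prompt. */
enum init_reply {
	REPLY_PATH,	/* looks like a path to an init program */
	REPLY_HALT,	/* the special string "halt" */
	REPLY_REBOOT,	/* the special string "reboot" */
	REPLY_DDB,	/* the special string "ddb" */
	REPLY_INVALID	/* neither a command nor a path */
};

/*
 * Classify one line of input. Treating "starts with '/'" as the test
 * for a usable path is an assumption of this sketch.
 */
static enum init_reply
classify_init_reply(const char *s)
{
	if (strcmp(s, "halt") == 0)
		return REPLY_HALT;
	if (strcmp(s, "reboot") == 0)
		return REPLY_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return REPLY_DDB;
	if (s[0] == '/')
		return REPLY_PATH;
	return REPLY_INVALID;
}
```

A caller would loop on the prompt, acting on the three commands, trying a REPLY_PATH as the init program, and asking again on REPLY_INVALID.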

Discussed on tech-kern with no objections. Summary of changes by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
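A minimal userland sketch of the scheme described above, assuming C11 atomics and a pthread mutex in place of the kernel primitives. The names struct once, run_once(), once_lock and failing_init() are hypothetical and do not reflect the kernel's actual once implementation.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical once-control block; not the kernel's actual layout. */
struct once {
	atomic_int done;	/* 0 until the routine has been run */
	int error;		/* result of the single run */
};

/* One global lock shared by all once-control blocks. */
static pthread_mutex_t once_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once(struct once *o, int (*fn)(void))
{
	/* Lockless fast path: the routine has already run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/*
		 * First caller: run the routine and record its result,
		 * whether success or failure. It is never retried.
		 */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock);
	return o->error;
}

/* Example routine for illustration: counts its calls and fails. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}
```

Because the result is stored before the done flag is published, later callers see the same error without running the routine again. Holding the global lock across fn() keeps the sketch short, but it means fn() must not itself call run_once().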


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
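The limit described above amounts to simple arithmetic, sketched here. The function names and the way the two sizes are passed in are assumptions for illustration; only the 2kB per-vnode cost and the 75% thresholds come from the commit message.

```c
#include <stdint.h>

/* Worst-case cost of one vnode and its associated structures (2kB). */
#define VNODE_COST_BYTES 2048u

/*
 * Hypothetical helper: the largest vnode count whose worst-case cost
 * stays within 75% of KVA and within 75% of physical memory.
 */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t physmem_bytes)
{
	/* The smaller of the two resources is the binding limit. */
	uint64_t limit = kva_bytes < physmem_bytes ? kva_bytes : physmem_bytes;

	/* Divide before multiplying so the 75% step cannot overflow. */
	return ((limit / 4) * 3) / VNODE_COST_BYTES;
}

/* Returns 0 if the requested value is acceptable, -1 otherwise. */
static int
check_desiredvnodes(uint64_t requested, uint64_t kva_bytes,
    uint64_t physmem_bytes)
{
	if (requested > max_desiredvnodes(kva_bytes, physmem_bytes))
		return -1;
	return 0;
}
```

For example, with 1GB of KVA and 4GB of physical memory, KVA is the binding limit and the cap works out to 393216 vnodes.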


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this to two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
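
The refcounting rule described above (a lookup holds a reference, and
detaching an in-use discipline fails with EBUSY) can be sketched in plain C
as follows. This is only an illustration: the ldisc_* names and struct
layout are hypothetical and are not the ttyldisc_*() interface itself, and
the locking a kernel would need is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry. */
struct ldisc {
	const char *name;
	int refcnt;		/* references handed out by lookup */
	struct ldisc *next;	/* linked list instead of a fixed array */
};

struct ldisc *ldisc_head = NULL;

/* Register a discipline; fails if the name is already taken. */
int
ldisc_attach(struct ldisc *ld)
{
	struct ldisc *p;

	for (p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_head;
	ldisc_head = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
struct ldisc *
ldisc_lookup(const char *name)
{
	struct ldisc *p;

	for (p = ldisc_head; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup(). */
void
ldisc_release(struct ldisc *ld)
{
	ld->refcnt--;
}

/* Unregister; refuses with EBUSY while references are held. */
int
ldisc_detach(struct ldisc *ld)
{
	struct ldisc **pp;

	if (ld->refcnt > 0)
		return EBUSY;
	for (pp = &ldisc_head; *pp != NULL; pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			return 0;
		}
	return ENOENT;
}
```

A caller would pair every successful ldisc_lookup() with ldisc_release()
before a detach can succeed.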


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine value of MACHINE for the system
@machine_arch value of MACHINE_ARCH for the system
@hostname the system host name, as set with sethostname()
@domainname the system domain name, as set with setdomainname()
@kernel_ident the kernel config file name
@osrelease the release number of the OS
@ostype the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
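
As a rough illustration of the expansion described above, the following C
sketch substitutes "@name" components in a symlink target. It is not the
kernel's implementation: expand_magic(), struct magic_var, and the rule that
a name only matches when followed by '/' or the end of the string are
assumptions made for this example, and the values are supplied by the caller.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair; values are supplied by the caller. */
struct magic_var {
	const char *name;	/* without the leading '@' */
	const char *value;
};

/*
 * Expand "@name" components of `in` into `out`.  A name only matches
 * when it is followed by '/' or the end of the string, so "@machine"
 * does not swallow the front of "@machine_arch".
 * Returns 0 on success, -1 if the result does not fit in `outlen` bytes.
 */
int
expand_magic(const char *in, char *out, size_t outlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t o = 0;

	if (outlen == 0)
		return -1;
	while (*in != '\0') {
		const char *sub = NULL;
		size_t sublen = 1, skip = 1, i;

		if (*in == '@') {
			for (i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (strncmp(in + 1, vars[i].name, n) == 0 &&
				    (in[1 + n] == '/' || in[1 + n] == '\0')) {
					sub = vars[i].value;
					sublen = strlen(sub);
					skip = 1 + n;
					break;
				}
			}
		}
		if (sub == NULL)
			sub = in;	/* copy one literal character */
		if (o + sublen >= outlen)
			return -1;	/* leave room for the terminator */
		memcpy(out + o, sub, sublen);
		o += sublen;
		in += skip;
	}
	out[o] = '\0';
	return 0;
}
```

With a table mapping machine_arch to "sparc", the target
/arch/@machine_arch/bin from the example above would expand to
/arch/sparc/bin.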


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
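
The marker-skipping iteration can be pictured with a small stand-alone
sketch. The struct, field names, and ENTRY_FOREACH macro below are
hypothetical and only mirror the idea; they are not the PROCLIST_FOREACH
definition.

```c
#include <stddef.h>

/* Hypothetical list element; markers are iterator placeholders. */
struct entry {
	int is_marker;		/* nonzero for marker entries */
	int id;
	struct entry *next;
};

/* Visit every real entry; marker entries are skipped silently. */
#define ENTRY_FOREACH(var, head)					\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if ((var)->is_marker)					\
			continue;					\
		else

/* Sum the ids of all non-marker entries. */
int
sum_ids(struct entry *head)
{
	struct entry *e;
	int sum = 0;

	ENTRY_FOREACH(e, head) {
		sum += e->id;
	}
	return sum;
}
```

The trailing "else" lets the macro be followed by an ordinary loop body
while keeping markers invisible to the caller.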


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
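
The sizing rule amounts to simple arithmetic, sketched below. The function
name, the lower bound parameter, and the byte-based inputs are assumptions
for illustration; the kernel's own variables and struct vnode size differ by
port.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in 1% of physical memory, but never fewer
 * than `min_vnodes`.  All inputs are illustrative; `vnode_size` must be
 * nonzero.
 */
uint64_t
default_vnodes(uint64_t physmem_bytes, size_t vnode_size,
    uint64_t min_vnodes)
{
	uint64_t n = (physmem_bytes / 100) / vnode_size;

	return n > min_vnodes ? n : min_vnodes;
}
```

A larger struct vnode, as on 64-bit ports, yields fewer vnodes for the same
1% budget, which is the concern raised above.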


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
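
The "invoke until nobody reports work" loop can be sketched as below. The
types and names (finalizer_fn, run_finalizers, countdown) are hypothetical
and stand in for whatever registration interface the kernel provides.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made (always at least one).
 */
int
run_finalizers(struct finalizer *list, size_t n)
{
	int passes = 0, did_work;
	size_t i;

	do {
		did_work = 0;
		for (i = 0; i < n; i++)
			if (list[i].fn(list[i].arg))
				did_work = 1;
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has work to do while the counter is positive. */
int
countdown(void *arg)
{
	int *left = arg;

	if (*left <= 0)
		return 0;
	(*left)--;
	return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for
example, configuring a RAID set) may create new work for another.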


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
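
For readers porting older code, the table above can be expressed as a small
translation function. This is a sketch only: the OLD_KERN_* enumerators are
illustrative stand-ins with arbitrary values, and KERN_FAILURE is left out
because the log gives it no single errno equivalent.

```c
#include <errno.h>

/* Illustrative stand-ins for the retired codes; values are arbitrary. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate a retired code to the errno value listed in the table. */
int
old_kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	}
	return EINVAL;	/* not reached for the enumerators above */
}
```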


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
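
As a rough illustration of what a minimum power-of-two alignment means for a region's start address, the arithmetic can be sketched as below. The helpers `is_pow2()` and `align_up()` are hypothetical names for this sketch and are not taken from uvm_map() itself.

```c
#include <stdint.h>

/* True if x is a non-zero power of two. */
static int is_pow2(uintptr_t x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align.
 * Assumes align is a power of two (check with is_pow2() first).
 */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    return (addr + align - 1) & ~(align - 1);
}
```

An address that is already aligned is returned unchanged, so the helper can be applied unconditionally.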


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
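
The sizing rule described here (0.5% of memory, but never fewer than NVNODE) might look roughly like the sketch below. The function name, its parameters, and the per-vnode byte cost are assumptions made for illustration; the kernel's actual computation may differ.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many vnodes fit in 0.5% of memory,
 * clamped from below by a configured minimum (NVNODE in the log).
 * per_vnode_bytes must be non-zero.
 */
static size_t scale_vnodes(uint64_t mem_bytes, size_t per_vnode_bytes,
                           size_t nvnode_min)
{
    uint64_t budget = mem_bytes / 200;          /* 0.5% of memory */
    uint64_t n = budget / per_vnode_bytes;      /* vnodes that fit */

    return n > nvnode_min ? (size_t)n : nvnode_min;
}
```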


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
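
To illustrate the two properties listed above (client-supplied storage and constant-time insertion and removal), here is a minimal sketch using a small hashed wheel of circular lists. The structure layout, function names, and WHEEL_SIZE are all hypothetical; this is not the kernel's callout code, only one way such properties can be obtained.

```c
#include <stddef.h>

#define WHEEL_SIZE 8    /* hypothetical; must be a power of two */

struct callout {
    struct callout *next, *prev;  /* links live in the client's own handle */
    unsigned long   expires;      /* absolute tick at which to fire */
};

/* Each wheel slot is the head of a circular doubly-linked list. */
static struct callout wheel[WHEEL_SIZE];

static void wheel_init(void)
{
    for (size_t i = 0; i < WHEEL_SIZE; i++)
        wheel[i].next = wheel[i].prev = &wheel[i];
}

/* O(1): hash the expiry tick to a slot and link at the head. */
static void callout_insert(struct callout *c, unsigned long expires)
{
    struct callout *head = &wheel[expires & (WHEEL_SIZE - 1)];

    c->expires = expires;
    c->next = head->next;
    c->prev = head;
    head->next->prev = c;
    head->next = c;
}

/* O(1): unlink using only the handle's own pointers. */
static void callout_remove(struct callout *c)
{
    c->prev->next = c->next;
    c->next->prev = c->prev;
    c->next = c->prev = c;
}

/* True if no callout is queued in the slot for this tick. */
static int slot_empty(unsigned long tick)
{
    const struct callout *head = &wheel[tick & (WHEEL_SIZE - 1)];

    return head->next == head;
}
```

Because the caller owns each `struct callout`, neither operation allocates memory, so neither can fail for lack of resources.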


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it into a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). Process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
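
The rule described above can be sketched as follows: the hard limit is the system-wide maximum, and the soft limit is the compiled-in default clamped to that maximum. The helper name `init_limit()` and its parameters are hypothetical; only `struct rlimit` and `rlim_t` come from the standard <sys/resource.h> header.

```c
#include <sys/resource.h>

/*
 * Hypothetical helper: build an initial limit from a default soft
 * value (e.g. NOFILE or MAXUPRC) and a system-wide maximum
 * (e.g. maxfiles or maxproc).
 */
static struct rlimit init_limit(rlim_t default_soft, rlim_t system_max)
{
    struct rlimit lim;

    lim.rlim_max = system_max;   /* not RLIM_INFINITY */
    lim.rlim_cur = default_soft < system_max ? default_soft : system_max;
    return lim;
}
```

With this shape, a program calling getrlimit(2) sees a finite hard limit it can size its tables against.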


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
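
Note the swapped source and destination arguments in the first two rules. A minimal sketch of the same mapping, written as wrapper functions with hypothetical `compat_` names (the commit itself rewrote call sites rather than adding wrappers):

```c
#include <string.h>

/* Old BSD order is (src, dst, len); the ISO C calls take (dst, src, len). */
static void compat_bcopy(const void *src, void *dst, size_t len)
{
    memcpy(dst, src, len);      /* regions must not overlap */
}

static void compat_ovbcopy(const void *src, void *dst, size_t len)
{
    memmove(dst, src, len);     /* safe for overlapping regions */
}

static int compat_bcmp(const void *a, const void *b, size_t len)
{
    return memcmp(a, b, len);   /* zero means equal, as with bcmp */
}

static void compat_bzero(void *p, size_t len)
{
    memset(p, 0, len);
}
```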


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quota.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.492 27-Oct-2017 joerg

Revert printf return value change.


# 1.491 27-Oct-2017 utkarsh009

[syzkaller] Cast all the printf's to (void *)
> as a result of new printf(9) declaration.


Revision tags: nick-nhusb-base-20170825 perseant-stdc-iso10646-base netbsd-8-base prg-localcount2-base3 prg-localcount2-base2 prg-localcount2-base1 prg-localcount2-base pgoyette-localcount-20170426 bouyer-socketcan-base1 jdolecek-ncq-base pgoyette-localcount-20170320 nick-nhusb-base-20170204
# 1.490 16-Jan-2017 ryo

branches: 1.490.4;
Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

branches: 1.489.2;
Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit the NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)
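
As a rough userland illustration of the run-once pattern described above, the sketch below uses POSIX pthread_once() as a stand-in for the kernel's RUN_ONCE facility. The names hook_once, establish_hook() and sem_syscall_entry() are hypothetical and do not appear in sysv_sem.c; the counter exists only to make the "runs exactly once" behaviour observable.

```c
#include <pthread.h>

/* Hypothetical one-time control block, standing in for a kernel ONCE_DECL. */
static pthread_once_t hook_once = PTHREAD_ONCE_INIT;

/* Counts how many times the setup routine actually ran. */
static int hook_established;

/* Hypothetical setup routine; in the kernel this would establish the exithook. */
static void
establish_hook(void)
{
	hook_established++;
}

/*
 * Hypothetical syscall entry point: make sure the hook exists before
 * doing any real work. Returns the number of times setup has run.
 */
static int
sem_syscall_entry(void)
{
	(void)pthread_once(&hook_once, establish_hook);
	return hook_established;
}
```

Calling sem_syscall_entry() any number of times, from any number of threads, runs establish_hook() only once, which is the property the deferred exithook setup relies on.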

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add:
	options SPLASHSCREEN
	makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
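
For readers unfamiliar with version 4 UUIDs, the sketch below shows the bit manipulation that turns 16 random bytes into one, following the RFC 4122 layout. It is an illustration only, not the kernel's uuidgen(2) code: the helper name uuid_v4_from_random() is hypothetical, and the caller is assumed to have already filled the buffer from a suitable random source.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert 16 caller-supplied random bytes into a
 * version 4 (random) UUID in place, using the RFC 4122 field layout.
 */
static void
uuid_v4_from_random(uint8_t uuid[16])
{
	/* Version field: the high nibble of byte 6 becomes 0100 (4). */
	uuid[6] = (uint8_t)((uuid[6] & 0x0f) | 0x40);

	/* Variant field: the top two bits of byte 8 become 10. */
	uuid[8] = (uint8_t)((uuid[8] & 0x3f) | 0x80);
}
```

Only six bits are fixed; the remaining 122 bits stay random, so no MAC address, timestamp, or shared generator state is involved.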


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
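
Since the use of initpaths is noted as undocumented, here is a small userland sketch of the general idea: walk a NULL-terminated list of candidate paths in order and take the first usable one. The list contents come from the commit message above; the helpers first_usable_path() and is_executable() are hypothetical. The sketch only checks execute permission with access(2), whereas the kernel presumably attempts to exec each candidate in turn.

```c
#include <stddef.h>
#include <unistd.h>

/* Candidate list as given in the commit message, NULL-terminated. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Hypothetical predicate: nonzero if the path exists and is executable. */
static int
is_executable(const char *path)
{
	return access(path, X_OK) == 0;
}

/*
 * Hypothetical helper: return the first entry of a NULL-terminated path
 * list accepted by the predicate, or NULL if none is.
 */
static const char *
first_usable_path(const char *const *paths, int (*usable)(const char *))
{
	for (size_t i = 0; paths[i] != NULL; i++) {
		if (usable(paths[i]))
			return paths[i];
	}
	return NULL;
}
```

A call such as first_usable_path(initpaths, is_executable) yields the earliest candidate present on the running system; appending "/rescue/init" to the end of the list makes it the last fallback.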


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high so it exhausts available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

	bool isroot;
	error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
	    cred, &isroot);
	if (error == 0 && !isroot)
		result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property to the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing definitively the problem.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot. namely, between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objections. Changes summary by apb@.
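
As a rough illustration of the prompt handling described in this commit, the
reply could be classified along the following lines. This is a sketch, not
the code in init_main.c: the enum, the function name, and the rule that a
usable path starts with '/' are all assumptions made for the example.

```c
#include <string.h>

/* Hypothetical classification of a reply typed at the init path prompt. */
enum init_reply_sketch {
	REPLY_HALT,	/* "halt" */
	REPLY_REBOOT,	/* "reboot" */
	REPLY_DDB,	/* "ddb" */
	REPLY_PATH,	/* looks like a path to an init program */
	REPLY_INVALID	/* anything else; the caller would prompt again */
};

static enum init_reply_sketch
classify_init_reply_sketch(const char *s)
{
	/* The three special strings named in the commit message. */
	if (strcmp(s, "halt") == 0)
		return REPLY_HALT;
	if (strcmp(s, "reboot") == 0)
		return REPLY_REBOOT;
	if (strcmp(s, "ddb") == 0)
		return REPLY_DDB;
	/* Assumption: only absolute paths are treated as candidates. */
	if (s[0] == '/')
		return REPLY_PATH;
	return REPLY_INVALID;
}
```

Checking the special strings before the path test keeps them from ever being
mistaken for a relative path.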


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
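
The pattern described here (a lockless fast path, one global lock for the
slow path, and no retry after an error) can be sketched in userspace C as
follows. This is only an analogue using C11 atomics and a pthread mutex; the
struct, the function names, and the example init routine are hypothetical and
do not correspond to the kernel's once implementation.

```c
#include <errno.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical once control; a zero-initialized object means "not yet run". */
struct once_sketch {
	atomic_int done;	/* nonzero once the init routine has finished */
	int error;		/* result of the single init attempt */
};

/* One global lock shared by all once controls, as the commit describes. */
static pthread_mutex_t once_lock_sketch = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	/* Fast path: no lock is taken after initialization has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_lock_sketch);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Runs exactly once; a failure is recorded, never retried. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_lock_sketch);
	return o->error;
}

/* Example init routine that always fails, to show the no-retry behaviour. */
static int init_calls_sketch;

static int
failing_init_sketch(void)
{
	init_calls_sketch++;
	return EIO;
}
```

Calling run_once_sketch() twice with failing_init_sketch returns EIO both
times while the routine itself runs only once. The sketch holds the global
lock while the routine runs, which is a simplification made for brevity.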


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
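
The limit described above works out to simple arithmetic. The sketch below
assumes the cap is the smaller of the two 75% budgets divided by the 2kB
worst-case cost; the helper names and the constant name are hypothetical and
are not taken from the kernel source.

```c
#include <stdint.h>

/* Worst-case bytes per vnode, per the commit message. */
#define VNODE_COST_SKETCH 2048u

/* Hypothetical helper: largest vnode count the described limits allow. */
static uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	/* The tighter of the two resources decides the cap. */
	uint64_t budget = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the budget; divide first so the product cannot overflow. */
	budget = budget / 4 * 3;
	return budget / VNODE_COST_SKETCH;
}

/* Hypothetical check a sysctl handler might apply to a requested value. */
static int
desiredvnodes_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= max_vnodes_sketch(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 512 MiB of physical memory, the budget is
75% of 512 MiB (384 MiB), which at 2kB per vnode gives a cap of 196608.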


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().
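
The "once control" pattern used in revisions 1.255 through 1.257 can be sketched in userspace as follows. This is only an illustration built on C11 atomics with a spin-wait; it is not the kernel's implementation, and the names `struct once`, `run_once`, `example_init` and `example_init_calls` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* States of a hypothetical once control. */
enum { ONCE_IDLE, ONCE_RUNNING, ONCE_DONE };

struct once {
	atomic_int state;
};

/* Example initializer (hypothetical): counts how often it is called. */
int example_init_calls;

void
example_init(void)
{
	example_init_calls++;
}

/*
 * Run init() exactly once per once control, no matter how many callers
 * arrive. The caller that wins the IDLE -> RUNNING transition runs
 * init(); everyone else waits until the state reaches DONE.
 */
void
run_once(struct once *o, void (*init)(void))
{
	int expected = ONCE_IDLE;

	assert(o != NULL && init != NULL);

	if (atomic_load(&o->state) == ONCE_DONE)
		return;		/* fast path: already initialized */

	if (atomic_compare_exchange_strong(&o->state, &expected,
	    ONCE_RUNNING)) {
		init();
		atomic_store(&o->state, ONCE_DONE);
		return;
	}

	/* Another caller is initializing; spin until it finishes. */
	while (atomic_load(&o->state) != ONCE_DONE)
		continue;
}
```

With this shape, a subsystem can call run_once() from its first entry point (an open routine, an attach routine) instead of relying on an explicit call from main().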


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
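
The substitution described in revision 1.248 can be illustrated with a small userspace sketch. This is not the kernel's lookup code: `struct magic_var` and `expand_magic` are hypothetical names, the values are supplied by the caller rather than read from system state, and the sketch replaces a magic string wherever it appears instead of checking path-component boundaries. The longest matching name wins, so "@machine_arch" is not mistaken for "@machine".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name/value pair, e.g. { "@machine_arch", "i386" }. */
struct magic_var {
	const char *name;
	const char *value;
};

/*
 * Copy src into dst (capacity dstlen), replacing every magic string
 * with its value. Returns 0 on success, -1 if dst is too small.
 */
int
expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;

	while (*src != '\0') {
		const struct magic_var *best = NULL;
		size_t bestlen = 0;

		if (*src == '@') {
			/* Pick the longest name that matches here. */
			for (size_t i = 0; i < nvars; i++) {
				size_t n = strlen(vars[i].name);

				if (n > bestlen &&
				    strncmp(src, vars[i].name, n) == 0) {
					best = &vars[i];
					bestlen = n;
				}
			}
		}

		if (best != NULL) {
			size_t vlen = strlen(best->value);

			if (out + vlen >= dstlen)
				return -1;	/* no room for value + NUL */
			memcpy(dst + out, best->value, vlen);
			out += vlen;
			src += bestlen;
		} else {
			if (out + 1 >= dstlen)
				return -1;	/* no room for char + NUL */
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

Given a table that maps "@machine_arch" to "i386", the link target "/arch/@machine_arch/bin" from the example usage would expand to "/arch/i386/bin".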


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
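
The sizing rule behind revisions 1.175 and 1.232 (a fixed fraction of physical memory, with a configured minimum) can be sketched as below. The function name, the per-mille parameter, and the vnode size passed by the caller are assumptions made for illustration; the kernel's actual computation may differ in units and rounding.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return how many vnodes fit in `permille` thousandths of physical
 * memory, but never fewer than `min_vnodes` (the role NVNODE plays in
 * the log). Use permille = 5 for 0.5% and permille = 10 for 1%.
 */
uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned permille, uint64_t min_vnodes)
{
	uint64_t budget, n;

	assert(permille <= 1000);
	if (vnode_size == 0)
		return min_vnodes;

	/* Divide before multiplying so large memory sizes cannot overflow. */
	budget = physmem_bytes / 1000 * permille;
	n = budget / vnode_size;

	return n > min_vnodes ? n : min_vnodes;
}
```

Doubling the fraction from 5 to 10 per mille roughly doubles the result on any machine large enough to exceed the minimum, which is the effect the 1.232 log message describes.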


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
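
The iteration rule from revision 1.208 (keep invoking all finalizers until a full pass reports no work) can be sketched like this. The types and names (`finalizer_fn`, `struct finalizer`, `run_finalizers`, `countdown_finalizer`) are hypothetical and only illustrate the loop; they are not the kernel's autoconfiguration interface.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Example finalizer (hypothetical): treats arg as a count of pending
 * work items and completes one item per call.
 */
int
countdown_finalizer(void *arg)
{
	int *pending = arg;

	if (*pending > 0) {
		(*pending)--;
		return 1;
	}
	return 0;
}

/*
 * Invoke every finalizer, and repeat the whole pass for as long as at
 * least one of them reports work. Returns the number of passes made.
 */
unsigned
run_finalizers(const struct finalizer *f, size_t n)
{
	unsigned passes = 0;
	int did_work;

	assert(f != NULL || n == 0);

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (f[i].fn(f[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);

	return passes;
}
```

Repeating the whole pass matters because work done by one finalizer (for example, configuring a RAID set) may expose new devices that another finalizer then has to handle.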


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicing when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS              0
KERN_INVALID_ADDRESS      EFAULT
KERN_PROTECTION_FAILURE   EACCES
KERN_NO_SPACE             ENOMEM
KERN_INVALID_ARGUMENT     EINVAL
KERN_FAILURE              various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE    ENOMEM
KERN_NOT_RECEIVER         <unused>
KERN_NO_ACCESS            <unused>
KERN_PAGES_LOCKED         <unused>
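
The mapping in revision 1.189 can be written out as a small translation function. This is only a sketch: the `OLD_KERN_*` enumerators are hypothetical stand-ins whose numeric values (other than success being 0) are illustrative, and `kern_to_errno` is not a kernel function. Codes without a single errno equivalent in the table return -1.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the retired codes. */
enum old_kern_code {
	OLD_KERN_SUCCESS = 0,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE,
	OLD_KERN_NOT_RECEIVER,
	OLD_KERN_NO_ACCESS,
	OLD_KERN_PAGES_LOCKED
};

/* The table maps success to 0, so the enum must agree. */
static_assert(OLD_KERN_SUCCESS == 0, "success must remain zero");

/*
 * Translate an old code to the E* value listed in the table.
 * Returns -1 for codes the table lists as unused or as having no
 * single replacement (KERN_FAILURE mostly became assertions).
 */
int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		return -1;
	}
}
```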


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
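
The sizing rule above can be sketched as follows. This is an illustration under stated assumptions, not the kernel code: scale_vnodes() is a hypothetical helper, the per-vnode size is passed in by the caller, and the check for an explicitly configured value is left out.

```c
#include <stdint.h>

/*
 * Derive a vnode count from memory size: budget 0.5% of memory,
 * divide by the (assumed, non-zero) per-vnode size, and never go
 * below the configured minimum.
 */
static uint64_t
scale_vnodes(uint64_t mem_bytes, uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t budget = mem_bytes / 200;	/* 0.5% of memory */
	uint64_t n = budget / vnode_size;

	return n > min_vnodes ? n : min_vnodes;
}
```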


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.
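
The two properties listed above can be illustrated with an intrusive doubly-linked list: the client embeds the handle in its own data, so nothing is allocated on insertion, and linking or unlinking touches a fixed number of pointers. This is only a sketch of the idea; the struct and function names are hypothetical and do not reflect the kernel's callout data structures.

```c
#include <stddef.h>

/* Client-supplied handle: the caller owns this storage. */
struct handle {
	struct handle *next, *prev;
	int pending;			/* non-zero while linked */
};

/* One bucket, a circular list with a sentinel head. */
struct bucket {
	struct handle head;
};

static void
bucket_init(struct bucket *b)
{
	b->head.next = b->head.prev = &b->head;
	b->head.pending = 0;
}

/* O(1): link h at the front of the bucket; no allocation needed. */
static void
handle_insert(struct bucket *b, struct handle *h)
{
	h->next = b->head.next;
	h->prev = &b->head;
	b->head.next->prev = h;
	b->head.next = h;
	h->pending = 1;
}

/* O(1): unlink h; harmless if h is not currently linked. */
static void
handle_remove(struct handle *h)
{
	if (!h->pending)
		return;
	h->prev->next = h->next;
	h->next->prev = h->prev;
	h->next = h->prev = NULL;
	h->pending = 0;
}

static int
bucket_empty(const struct bucket *b)
{
	return b->head.next == &b->head;
}
```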

The old timeout()/untimeout() API has been removed from the kernel.


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that is has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second-level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
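
A small sketch of the initialization rule described in this entry: the hard limit is the system-wide maximum, and the soft limit is the compile-time default clamped to that maximum. The struct and helper names are hypothetical stand-ins, not the kernel's struct rlimit code.

```c
#include <stdint.h>

/* Hypothetical stand-in for the two fields of struct rlimit. */
struct limit_pair {
	uint64_t cur;	/* soft limit (rlim_cur) */
	uint64_t max;	/* hard limit (rlim_max) */
};

static uint64_t
min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/*
 * default_soft plays the role of NOFILE or MAXUPRC;
 * system_max plays the role of maxfiles or maxproc.
 */
static struct limit_pair
init_limit(uint64_t default_soft, uint64_t system_max)
{
	struct limit_pair lp;

	lp.max = system_max;				/* not RLIM_INFINITY */
	lp.cur = min_u64(default_soft, system_max);	/* never above max */
	return lp;
}
```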


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
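
The main pitfall in this mapping is argument order: bcopy() takes the source first, while memcpy() and memmove() take the destination first. A short sketch, with hypothetical helper names that do not appear in init_main.c:

```c
#include <string.h>

/* Copy len bytes from src to dst. */
static void
copy_record(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);		/* was: bcopy(src, dst, len) */
}

/* Zero len bytes of buf. */
static void
clear_record(char *buf, size_t len)
{
	memset(buf, 0, len);		/* was: bzero(buf, len) */
}
```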


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.490 16-Jan-2017 ryo

Make pfil(9) MP-safe (applying psref(9))


Revision tags: bouyer-socketcan-base pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*
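
A hedged sketch of the size mismatch this entry describes: when each syscall-argument slot is padded wider than a pointer, the byte count of an array of slots differs from the byte count of an argv[] of the same length. The union below is a hypothetical stand-in for a padded argument slot, and the element count of 3 is taken from the entry above; this is not the code from start_init().

```c
#include <stddef.h>

/* Hypothetical padded argument slot, at least as wide as long long. */
union arg_slot {
	char *ptr;
	long long pad;
};

#define INIT_ARGC 3	/* number of argv[] entries, per the entry above */

/* Bytes occupied by init's argv[]: the size that should be copied out. */
static size_t
argv_bytes(void)
{
	return INIT_ARGC * sizeof(char *);
}

/* Bytes occupied by the same number of padded slots: the wrong size. */
static size_t
slot_bytes(void)
{
	return INIT_ARGC * sizeof(union arg_slot);
}
```

On an ABI where pointers are 32 bits wide and the slots are 64 bits wide, slot_bytes() is twice argv_bytes(), so a copy sized by the slot array would overrun the argv[] area.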


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most of the code in in.c is imported from FreeBSD, as is lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high, which exhausted available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed with no objection on tech-kern. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
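
The behaviour described for 1.364 (one global lock, a lockless fast path,
and no retry after an error) can be illustrated with a small userland
sketch. This is not the kernel's code: the names once_sketch,
run_once_sketch and failing_init_sketch are hypothetical, and C11 atomics
plus a pthread mutex stand in for the kernel primitives.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once-control type; not the kernel's actual once_t. */
struct once_sketch {
	atomic_int done;	/* 0 = not yet run, 1 = run (success or failure) */
	int error;		/* result of the single run */
};

#define ONCE_SKETCH_INIT { 0, 0 }

/*
 * One global lock shared by every once control, so no per-object lock
 * needs a static initializer.
 */
static pthread_mutex_t once_sketch_lock = PTHREAD_MUTEX_INITIALIZER;

static int
run_once_sketch(struct once_sketch *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock is taken once the routine has run. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	pthread_mutex_lock(&once_sketch_lock);
	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* The routine runs exactly once, even if it fails. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}
	pthread_mutex_unlock(&once_sketch_lock);
	return o->error;
}

/* Example init routine that always fails, counting how often it runs. */
static int failing_init_calls;

static int
failing_init_sketch(void)
{
	failing_init_calls++;
	return 5;
}
```

Later callers get the stored error back instead of triggering a retry.
Because the sketch holds the global lock while the routine runs, a routine
that itself used another once control would deadlock here; a real
implementation would need to account for that.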


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
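
As a rough worked example of the limit described for 1.359, the sketch
below computes the largest vnode count whose worst-case footprint stays
within 75% of both KVA and physical memory. The function names and the
byte-based interface are assumptions made for illustration; the actual
sysctl handler is not shown in this log.

```c
#include <stdint.h>

/* Assumed worst-case cost of one vnode and its associated data: 2kB. */
#define VNODE_COST_SKETCH 2048u

/*
 * Largest vnode count whose worst-case footprint fits in 75% of the
 * smaller of KVA and physical memory (both given in bytes).
 */
static uint64_t
vnode_limit_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t limit = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	return (limit / 4 * 3) / VNODE_COST_SKETCH;
}

/* Returns 0 if the requested value is acceptable, -1 otherwise. */
static int
check_desiredvnodes_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= vnode_limit_sketch(kva_bytes, phys_bytes) ? 0 : -1;
}
```

With 1 GiB of KVA and 4 GiB of physical memory, the KVA bound dominates:
75% of 1 GiB is 805306368 bytes, which at 2048 bytes per vnode allows
393216 vnodes.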


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in the few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
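
The expansion described in this entry can be sketched as a small string rewrite. This is an illustration only: struct magic_var and expand_magic() are hypothetical names, the table contents come from the caller, and the rule that a magic name must be followed by '/' or the end of the string is an assumption of this sketch, not something the log entry states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative name/value pair; the values are supplied by the caller. */
struct magic_var {
	const char *name;	/* e.g. "@machine_arch" */
	const char *value;	/* e.g. "m68k" */
};

/*
 * Copy src into dst, replacing any magic name that is immediately
 * followed by '/' or the end of the string. Returns 0 on success,
 * -1 if dst is too small.
 */
static int expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic_var *vars, size_t nvars)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL && dstlen > 0);
	while (*src != '\0') {
		const char *piece = NULL;
		size_t piecelen = 1, skip = 1;

		for (size_t i = 0; *src == '@' && i < nvars; i++) {
			size_t n = strlen(vars[i].name);

			if (strncmp(src, vars[i].name, n) == 0 &&
			    (src[n] == '/' || src[n] == '\0')) {
				piece = vars[i].value;
				piecelen = strlen(piece);
				skip = n;
				break;
			}
		}
		if (piece == NULL)
			piece = src;	/* ordinary character: copy it through */
		if (out + piecelen >= dstlen)
			return -1;	/* no room for the piece plus the NUL */
		memcpy(dst + out, piece, piecelen);
		out += piecelen;
		src += skip;
	}
	dst[out] = '\0';
	return 0;
}
```

Requiring a terminator after the name keeps "@machine" from matching the front of "@machine_arch", so the order of the table entries does not matter.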


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. (Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.)


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.
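
The sizing rule discussed here (a fixed fraction of physical memory, with a floor) can be written out as a short calculation. This is a sketch under stated assumptions: default_vnode_count() and its parameters are hypothetical, and the vnode size is passed in because the entry itself notes that it varies between ports.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pick a default vnode count: the number of vnodes that fit in
 * 1/fraction_div of physical memory (100 for 1%, 200 for 0.5%),
 * but never fewer than min_vnodes.
 */
static uint64_t default_vnode_count(uint64_t physmem_bytes,
    uint64_t vnode_size, uint64_t fraction_div, uint64_t min_vnodes)
{
	uint64_t fit;

	assert(vnode_size > 0 && fraction_div > 0);
	fit = physmem_bytes / fraction_div / vnode_size;
	return fit > min_vnodes ? fit : min_vnodes;
}
```

For example, assuming a 256-byte vnode and 1 GiB of memory, moving the divisor from 200 to 100 roughly doubles the result; a larger struct vnode yields fewer vnodes for the same fraction, which is the concern raised for 64-bit ports.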


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework.
Currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals.

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
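
The "invoke iteratively until nobody reports work" rule in this entry can be sketched as a fixed-point loop. The names below (struct finalizer, run_finalizers(), configure_one()) are hypothetical and stand in for whatever the kernel hook actually uses.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did any work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Call every finalizer repeatedly until one full pass reports no work.
 * Returns the number of passes made, including the final idle pass.
 */
static unsigned run_finalizers(const struct finalizer *list, size_t n)
{
	unsigned passes = 0;
	int worked;

	assert(list != NULL || n == 0);
	do {
		worked = 0;
		for (size_t i = 0; i < n; i++)
			worked |= (list[i].fn(list[i].arg) != 0);
		passes++;
	} while (worked);
	return passes;
}

/* Example finalizer: configures one pending unit per call. */
static int configure_one(void *arg)
{
	int *pending = arg;

	if (*pending == 0)
		return 0;
	(*pending)--;
	return 1;
}
```

Repeating whole passes matters when one finalizer's work creates new devices that another finalizer (or the same one) can then act on, as with RAID sets built on top of newly discovered disks.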


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
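
The mapping above can be expressed as a lookup function. This is a sketch: the OLD_KERN_* enumerators and their numbering are illustrative stand-ins, not the historical header values, and returning -1 for the failure case is an assumption standing in for "no single errno".

```c
#include <assert.h>
#include <errno.h>

/* Illustrative stand-ins for the retired codes that were in use. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_FAILURE,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
static int kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	case OLD_KERN_FAILURE:
		return -1;	/* no single errno; handled case by case */
	}
	assert(!"unhandled old_kern_code");
	return -1;
}
```

Note that two distinct old codes collapse onto ENOMEM, so the translation is not reversible.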


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.
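
The overflow this entry avoids can be illustrated with a page-count calculation. The sketch assumes a 64-bit limit type and uses hypothetical helper names; uint32_t is used for the narrow case so that the wrap-around is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Byte count computed in 32 bits: wraps once memory reaches 4 GiB. */
static uint32_t avail_bytes_32(uint32_t pages, uint32_t page_size)
{
	return pages * page_size;
}

/* Same computation widened first, as a 64-bit limit type allows. */
static uint64_t avail_bytes_64(uint32_t pages, uint32_t page_size)
{
	assert(page_size != 0);
	return (uint64_t)pages * page_size;
}
```

With 1048576 pages of 4096 bytes, the 32-bit version wraps to zero while the widened version keeps the full 4 GiB figure.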


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which CPU a process is running on/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.
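
The idea of pairing run queues with a bitmap of non-empty queues, as
sched_qs and sched_whichqs are described above, can be sketched roughly as
follows. All toy_* names are hypothetical, the queues are reduced to
counters, and lower indices are assumed to mean higher priority.

```c
#include <stdint.h>

#define TOY_NQS 32			/* hypothetical number of run queues */

static uint32_t toy_whichqs;		/* bit q set <=> queue q is non-empty */
static int toy_qlen[TOY_NQS];		/* queue lengths stand in for real lists */

/* Put one runnable entry on queue q and mark the queue non-empty. */
static void
toy_enqueue(int q)
{
	toy_qlen[q]++;
	toy_whichqs |= (uint32_t)1 << q;
}

/* Take one entry from the lowest-numbered non-empty queue, or return -1. */
static int
toy_dequeue_best(void)
{
	int q;

	if (toy_whichqs == 0)
		return -1;
	for (q = 0; (toy_whichqs & ((uint32_t)1 << q)) == 0; q++)
		continue;
	if (--toy_qlen[q] == 0)
		toy_whichqs &= ~((uint32_t)1 << q);
	return q;
}
```

The bitmap lets the "is anything runnable?" check be a single comparison
against zero instead of a walk over every queue.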


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
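
One common way to get both properties listed above (caller-supplied storage
and constant-time insertion and removal) is a hashed timing wheel with
doubly linked buckets. The following is a toy sketch under that assumption,
not the kernel's code; every toy_* name is hypothetical, and callbacks are
assumed not to touch other callouts while the wheel is being scanned.

```c
#include <stddef.h>

#define TOY_WHEEL_SIZE 8	/* hypothetical bucket count; a power of two */

struct toy_callout {
	struct toy_callout *next;	/* next entry in the same bucket */
	struct toy_callout **prevp;	/* address of the pointer that points at us */
	void (*func)(void *);
	void *arg;
	unsigned long expire;		/* absolute tick at which to fire */
};

static struct toy_callout *toy_wheel[TOY_WHEEL_SIZE];
static unsigned long toy_now;

/* The caller owns the storage; this only marks it as not pending. */
static void
toy_callout_init(struct toy_callout *c)
{
	c->next = NULL;
	c->prevp = NULL;
}

/* O(1) removal: unlink through prevp without walking the bucket. */
static void
toy_callout_stop(struct toy_callout *c)
{
	if (c->prevp == NULL)
		return;			/* not pending */
	*c->prevp = c->next;
	if (c->next != NULL)
		c->next->prevp = c->prevp;
	c->next = NULL;
	c->prevp = NULL;
}

/* O(1) insertion: push onto the head of the bucket for the expiry tick. */
static void
toy_callout_reset(struct toy_callout *c, unsigned long ticks,
    void (*func)(void *), void *arg)
{
	struct toy_callout **head;

	toy_callout_stop(c);
	c->func = func;
	c->arg = arg;
	c->expire = toy_now + ticks;
	head = &toy_wheel[c->expire & (TOY_WHEEL_SIZE - 1)];
	c->next = *head;
	if (c->next != NULL)
		c->next->prevp = &c->next;
	c->prevp = head;
	*head = c;
}

/* Advance time by one tick and fire whatever expires exactly now. */
static void
toy_tick(void)
{
	struct toy_callout *c, *next;

	toy_now++;
	for (c = toy_wheel[toy_now & (TOY_WHEEL_SIZE - 1)]; c != NULL; c = next) {
		next = c->next;
		if (c->expire == toy_now) {
			toy_callout_stop(c);
			c->func(c->arg);
		}
	}
}

/* Example callback: bump the int that arg points at. */
static void
toy_count(void *arg)
{
	++*(int *)arg;
}
```

Because the handle lives in the client's own structure, arming a callout
never allocates memory and so cannot fail for lack of it.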


# 1.164 10-Mar-2000 enami

Create a new kernel thread to issue the statfs(2) system call to check free
disk space rather than doing it in the timeout handler. This fixes the
long-standing bug that the accounting file can't be put on an NFS file system
(so, e.g., we couldn't turn on accounting on a diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace the kern.shortcorename sysctl with a more flexible scheme,
the core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename.
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
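
The two rules above amount to clamping both limits to the system-wide
maximum. A minimal sketch, using a hypothetical toy_rlimit structure and
helper names rather than the kernel's own types:

```c
/* Hypothetical stand-in for struct rlimit. */
struct toy_rlimit {
	unsigned long cur;	/* soft limit */
	unsigned long max;	/* hard limit */
};

static unsigned long
toy_ulmin(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/*
 * Initialize one resource limit: the hard limit is the system-wide
 * maximum (e.g. maxfiles or maxproc) instead of infinity, and the soft
 * limit is the compile-time default (e.g. NOFILE or MAXUPRC) clamped to
 * that maximum.
 */
static void
toy_init_limit(struct toy_rlimit *rl, unsigned long soft_default,
    unsigned long system_max)
{
	rl->max = system_max;
	rl->cur = toy_ulmin(soft_default, system_max);
}
```

A caller that sizes a table from the hard limit then gets a finite,
reasonable number, and the soft limit can never exceed the hard one.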


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
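
The main trap in the table above is that the source and destination
arguments swap places for the copy routines. A small sketch of the
converted forms, with the old spelling kept in comments; the function name
is made up for illustration.

```c
#include <string.h>

/*
 * Exercise each converted call once.  Returns 1 if the buffer ends up
 * with the expected contents, 0 otherwise.
 */
static int
demo_mem_conversions(char out[8])
{
	const char src[8] = "abcdefg";

	memset(out, 0, 8);		/* was: bzero(out, 8) */
	memcpy(out, src, sizeof(src));	/* was: bcopy(src, out, sizeof(src)) */
	memmove(out + 1, out, 3);	/* was: ovbcopy(out, out + 1, 3); regions overlap */

	/* was: bcmp(out, "aabcefg", 8); the argument order is unchanged */
	return memcmp(out, "aabcefg", 8) == 0;
}
```

memmove() is used for the overlapping case because memcpy() makes no
guarantee when source and destination overlap.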


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


Revision tags: pgoyette-localcount-20170107
# 1.489 05-Jan-2017 pgoyette

Actually initialize the sysctl stuff for kernhist! Missed this file
in earlier commits.


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
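
For illustration only: a version 4 UUID is 16 random bytes with the
version and variant bits overwritten. The sketch below assumes the
RFC 4122 byte layout and a hypothetical helper name; it is not the
kernel's uuidgen(2) code, and the caller is assumed to supply the
random bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Stamp 16 caller-supplied random bytes as a version 4 UUID.
 * Hypothetical helper using the RFC 4122 byte layout.
 */
static void
uuid_v4_stamp(uint8_t u[16])
{
	assert(u != NULL);
	/* High nibble of byte 6 carries the version: 4 = random. */
	u[6] = (uint8_t)((u[6] & 0x0f) | 0x40);
	/* Top two bits of byte 8 carry the variant: binary 10. */
	u[8] = (uint8_t)((u[8] & 0x3f) | 0x80);
}
```

All other bits stay random, which is why no MAC address, timestamp or
global state is needed.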


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
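
For illustration only: the list quoted above is NULL-terminated, so it
can be walked in order until one entry is usable. The sketch below is a
userland approximation; first_usable_init() and the demo_* predicates
are hypothetical names, and how the kernel actually decides that a path
is usable is not described in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The list as quoted in the commit message. */
static const char *const initpaths[] = {
	"/sbin/init", "/sbin/oinit", "/sbin/init.bak", "/rescue/init", NULL
};

/* Return the first path accepted by usable(), or NULL if none is. */
static const char *
first_usable_init(const char *const *paths, int (*usable)(const char *))
{
	assert(paths != NULL && usable != NULL);
	for (; *paths != NULL; paths++) {
		if (usable(*paths))
			return *paths;
	}
	return NULL;
}

/* Hypothetical predicates standing in for "can this binary be run?". */
static int
demo_only_rescue(const char *path)
{
	return strcmp(path, "/rescue/init") == 0;
}

static int
demo_nothing(const char *path)
{
	(void)path;
	return 0;
}
```

Because "/rescue/init" is last, it is only reached when every earlier
candidate has been rejected.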


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

calculate vnode cache size based on the resource it gets allocated from
this stops setting kern.maxvnodes too high, which would exhaust the available space in kmem

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted sideeffects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device,booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- datastructures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.
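
For illustration only: the simplest of the FIPS 140-2 statistical tests
is the monobit test, which counts the one bits in a 20000-bit sample.
The sketch below is not libkern/rngtest.c; the function name is
hypothetical, and the acceptance bounds (9725 < ones < 10275) are the
ones usually quoted for FIPS 140-2 and should be checked against the
standard before being relied on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MONOBIT_BYTES	2500	/* 20000 bits per sample */

/* Return nonzero if the sample passes the monobit test. */
static int
monobit_ok(const uint8_t buf[MONOBIT_BYTES])
{
	unsigned int ones = 0;

	assert(buf != NULL);
	for (size_t i = 0; i < MONOBIT_BYTES; i++) {
		/* Clear the lowest set bit until none remain. */
		for (uint8_t b = buf[i]; b != 0; b &= (uint8_t)(b - 1))
			ones++;
	}
	/* Assumed FIPS 140-2 bounds, exclusive on both sides. */
	return ones > 9725 && ones < 10275;
}
```

A stuck source that emits all zeros or all ones fails at once, while a
well-behaved source fails only rarely; such rare failures are the kind
of false positive the commit message refers to.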


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if the input string is not a valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed on tech-kern with no objections. Changes summary by apb@.


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
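
The behaviour described above (one global lock, a lockless fast path, and
no retry after an error) can be sketched in userland C11 as follows. This
is only an illustration under stated assumptions, not the kernel's code:
once_sketch_t, run_once_sketch() and the demo routines are hypothetical
names, and a spinning atomic_flag stands in for whatever global lock the
kernel actually uses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical once-control record; not the kernel's real once_t. */
typedef struct {
	atomic_int done;	/* 0 = not yet run, 1 = routine has completed */
	int error;		/* result of the single run, valid once done == 1 */
} once_sketch_t;

/* One global lock shared by every once control (assumption: spin flag). */
static atomic_flag once_sketch_lock = ATOMIC_FLAG_INIT;

static int
run_once_sketch(once_sketch_t *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->done, memory_order_acquire))
		return o->error;

	/* Slow path: serialize first-time callers on the global lock. */
	while (atomic_flag_test_and_set_explicit(&once_sketch_lock,
	    memory_order_acquire))
		continue;

	if (!atomic_load_explicit(&o->done, memory_order_relaxed)) {
		/* Record the result, even an error, and never run fn again. */
		o->error = fn();
		atomic_store_explicit(&o->done, 1, memory_order_release);
	}

	atomic_flag_clear_explicit(&once_sketch_lock, memory_order_release);
	return o->error;
}

/* Demo routines (hypothetical): count how often they are really run. */
static int demo_calls;
static int demo_ok(void) { demo_calls++; return 0; }
static int demo_fail(void) { demo_calls++; return 5; }
```

Later callers get the stored result back, so a failed routine reports the
same error every time instead of being retried. Holding a spin lock across
fn() is a simplification that only suits this sketch.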


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi System's WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
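
The limit described above amounts to a small calculation, sketched here in
standalone C. The 2kB cost and the 75% caps come from the log message;
the function and macro names are hypothetical, and how the kernel obtains
the KVA and physical memory sizes is not shown.

```c
#include <stdint.h>

/* Worst-case cost of one vnode in bytes, as assumed in the log message. */
#define VNODE_COST_SKETCH	2048u

/*
 * Largest desiredvnodes value that keeps vnode memory within 75% of
 * kernel virtual address space and 75% of physical memory.
 * Dividing before multiplying avoids overflow for large sizes.
 */
static uint64_t
max_vnodes_sketch(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t kva_limit = kva_bytes / 4 * 3 / VNODE_COST_SKETCH;
	uint64_t phys_limit = phys_bytes / 4 * 3 / VNODE_COST_SKETCH;

	return kva_limit < phys_limit ? kva_limit : phys_limit;
}

/* Returns 1 if a requested sysctl value would be accepted, else 0. */
static int
vnodes_request_ok_sketch(uint64_t requested, uint64_t kva_bytes,
    uint64_t phys_bytes)
{
	return requested <= max_vnodes_sketch(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 512 MiB of physical memory the physical
memory cap is the tighter one: 512 MiB * 0.75 / 2 KiB = 196608 vnodes.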


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec underwent some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but no longer do.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
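
The linked-list and reference-counting scheme listed above can be sketched
in standalone C as follows. This is an illustration only, not the kernel's
implementation: every name ending in _sketch is hypothetical, the EEXIST
and ENOENT results are assumptions (only EBUSY is stated in the log), and
the locking a real kernel would need is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry. */
struct ldisc_sketch {
	const char *name;		/* discipline name, e.g. "termios" */
	unsigned int refcnt;		/* references held by lookups */
	struct ldisc_sketch *next;	/* linked list replaces the array */
};

static struct ldisc_sketch *ldisc_list_sketch;

/* Register a discipline; a duplicate name is rejected (assumed EEXIST). */
static int
ldisc_attach_sketch(struct ldisc_sketch *ld)
{
	assert(ld != NULL && ld->name != NULL);
	for (struct ldisc_sketch *p = ldisc_list_sketch; p != NULL; p = p->next)
		if (strcmp(p->name, ld->name) == 0)
			return EEXIST;
	ld->refcnt = 0;
	ld->next = ldisc_list_sketch;
	ldisc_list_sketch = ld;
	return 0;
}

/* Look up by name; a successful lookup holds a reference. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	for (struct ldisc_sketch *p = ldisc_list_sketch; p != NULL; p = p->next)
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	return NULL;
}

/* Drop a reference obtained from ldisc_lookup_sketch(). */
static void
ldisc_release_sketch(struct ldisc_sketch *ld)
{
	assert(ld != NULL && ld->refcnt > 0);
	ld->refcnt--;
}

/* Unregister; an in-use discipline cannot be detached (EBUSY). */
static int
ldisc_detach_sketch(struct ldisc_sketch *ld)
{
	if (ld->refcnt != 0)
		return EBUSY;
	for (struct ldisc_sketch **pp = &ldisc_list_sketch; *pp != NULL;
	    pp = &(*pp)->next)
		if (*pp == ld) {
			*pp = ld->next;
			ld->next = NULL;
			return 0;
		}
	return ENOENT;	/* assumption: never attached */
}
```

A caller that looks up a discipline must release it before the provider
can detach it, which is the property the EBUSY rule enforces.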


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks
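
A minimal sketch of the receive-path lookup described in this entry: try the
h/w key index map first, fall back to a lookup by address, and update the map
when the node found has an rx key index set. This is a simplified user-space
illustration, not the net80211 code. `struct node`, `struct node_table`,
`find_rxnode_withkey`, `KEYIX_NONE` and the table sizes are hypothetical
stand-ins, a linear scan replaces the hash lookup, and locking and reference
counting are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KEYIX_NONE ((uint16_t)0xffff) /* hypothetical "no h/w key index" value */
#define MAX_NODES  8
#define MAX_KEYIX  16                 /* stands in for the driver-supplied max key index */

struct node {
    uint8_t  macaddr[6];
    uint16_t rxkeyix;                 /* rx key index of the unicast key, or KEYIX_NONE */
};

struct node_table {
    struct node  nodes[MAX_NODES];
    size_t       nnodes;
    struct node *keyixmap[MAX_KEYIX]; /* h/w key index -> node, NULL if unmapped */
};

/* Slow path: find a node by MAC address (linear scan in place of a hash lookup). */
static struct node *
find_by_mac(struct node_table *nt, const uint8_t mac[6])
{
    for (size_t i = 0; i < nt->nnodes; i++)
        if (memcmp(nt->nodes[i].macaddr, mac, 6) == 0)
            return &nt->nodes[i];
    return NULL;
}

static struct node *
find_rxnode_withkey(struct node_table *nt, const uint8_t mac[6], uint16_t keyix)
{
    struct node *ni;

    assert(nt != NULL);

    /* Fast path: the frame carried a key index that is already mapped. */
    if (keyix != KEYIX_NONE && keyix < MAX_KEYIX && nt->keyixmap[keyix] != NULL)
        return nt->keyixmap[keyix];

    /* Slow path: look the node up by address, then remember its key index. */
    ni = find_by_mac(nt, mac);
    if (ni != NULL && ni->rxkeyix != KEYIX_NONE && ni->rxkeyix < MAX_KEYIX)
        nt->keyixmap[ni->rxkeyix] = ni;
    return ni;
}
```

In this sketch a caller that has no key index passes KEYIX_NONE and always
takes the slow path, which matches the note above that drivers using the old
lookup need not be aware of the map.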

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
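
A minimal user-space sketch of the expansion, to make the substitution
concrete. It is not the kernel implementation: `struct magic`, `expand_magic`
and the replacement values used with it are hypothetical, and the rule that a
token must be followed by '/' or the end of the string is an assumption of
this sketch (it keeps @machine from matching the start of @machine_arch).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One magic string (without the leading '@') and its replacement. */
struct magic {
    const char *name;
    const char *value;
};

/*
 * Expand "@name" tokens from src into dst (capacity dstlen, including the
 * terminating NUL). Unknown tokens are copied through unchanged.
 * Returns 0 on success, -1 if the result does not fit.
 */
static int
expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic *tab, size_t ntab)
{
    size_t out = 0;

    assert(src != NULL && dst != NULL);

    while (*src != '\0') {
        const char *repl = NULL;
        size_t skip = 0;

        if (*src == '@') {
            for (size_t i = 0; i < ntab; i++) {
                size_t n = strlen(tab[i].name);

                /* Match the name, then require '/' or end of string. */
                if (strncmp(src + 1, tab[i].name, n) == 0 &&
                    (src[1 + n] == '/' || src[1 + n] == '\0')) {
                    repl = tab[i].value;
                    skip = 1 + n;
                    break;
                }
            }
        }

        if (repl != NULL) {
            size_t rl = strlen(repl);

            if (out + rl >= dstlen)
                return -1;
            memcpy(dst + out, repl, rl);
            out += rl;
            src += skip;
        } else {
            if (out + 1 >= dstlen)
                return -1;
            dst[out++] = *src++;
        }
    }

    if (out >= dstlen)
        return -1;
    dst[out] = '\0';
    return 0;
}
```

With a table that maps machine_arch to the running architecture, the link
target /arch/@machine_arch/bin from the example usage would resolve to the
matching per-architecture directory.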


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
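
A minimal sketch of a marker-skipping iteration, assuming a singly linked list
with a flag on each entry. `struct entry`, `skip_markers`, `ENTRY_FOREACH` and
`count_real` are hypothetical names for illustration; the kernel macro works
on the real proclist and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

struct entry {
    struct entry *next;
    int           is_marker;  /* nonzero for placeholder entries */
    int           pid;
};

/* Advance past any marker entries; returns NULL at the end of the list. */
static struct entry *
skip_markers(struct entry *e)
{
    while (e != NULL && e->is_marker)
        e = e->next;
    return e;
}

/* Like a plain list walk, but the body never sees a marker entry. */
#define ENTRY_FOREACH(var, head)                 \
    for ((var) = skip_markers(head);             \
        (var) != NULL;                           \
        (var) = skip_markers((var)->next))

/* Count the non-marker entries on the list. */
static int
count_real(struct entry *head)
{
    struct entry *e;
    int n = 0;

    ENTRY_FOREACH(e, head)
        n++;
    return n;
}
```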


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
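
A minimal sketch of the "run until nobody did any work" loop described above.
`finalizer_fn`, `struct finalizer`, `run_finalizers` and `drain_one` are
hypothetical names; the real hook's registration interface is not shown in
this entry and is not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *arg);

struct finalizer {
    finalizer_fn fn;
    void        *arg;
};

/*
 * Invoke every finalizer, and repeat until one complete pass finishes with
 * all of them reporting no work. Returns the number of passes made.
 */
static int
run_finalizers(struct finalizer *tab, size_t n)
{
    int passes = 0;
    int did_work;

    do {
        did_work = 0;
        for (size_t i = 0; i < n; i++)
            if (tab[i].fn(tab[i].arg))
                did_work = 1;
        passes++;
    } while (did_work);

    return passes;
}

/* Example finalizer: handles one unit of pending work per call. */
static int
drain_one(void *arg)
{
    int *pending = arg;

    if (*pending > 0) {
        (*pending)--;
        return 1;
    }
    return 0;
}
```

Repeating whole passes matters because work done by one finalizer can expose
new work for another that already ran earlier in the same pass.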


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS             0
KERN_INVALID_ADDRESS     EFAULT
KERN_PROTECTION_FAILURE  EACCES
KERN_NO_SPACE            ENOMEM
KERN_INVALID_ARGUMENT    EINVAL
KERN_FAILURE             various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE   ENOMEM
KERN_NOT_RECEIVER        <unused>
KERN_NO_ACCESS           <unused>
KERN_PAGES_LOCKED        <unused>
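
The mapping above, written out as a lookup for reference. This is only a
sketch: the `OLD_KERN_*` enumerators and `old_kern_to_errno` are hypothetical
names, the numeric values of the enumerators are illustrative, and codes with
no single replacement (KERN_FAILURE and the unused ones) return -1 here.

```c
#include <assert.h>
#include <errno.h>

/* Stand-ins for the retired codes; the numeric values are illustrative only. */
enum old_kern_code {
    OLD_KERN_SUCCESS = 0,
    OLD_KERN_INVALID_ADDRESS,
    OLD_KERN_PROTECTION_FAILURE,
    OLD_KERN_NO_SPACE,
    OLD_KERN_INVALID_ARGUMENT,
    OLD_KERN_FAILURE,
    OLD_KERN_RESOURCE_SHORTAGE,
    OLD_KERN_NOT_RECEIVER,
    OLD_KERN_NO_ACCESS,
    OLD_KERN_PAGES_LOCKED
};

/* Translate an old code to its E* replacement, or -1 if there is no single one. */
static int
old_kern_to_errno(enum old_kern_code code)
{
    switch (code) {
    case OLD_KERN_SUCCESS:
        return 0;
    case OLD_KERN_INVALID_ADDRESS:
        return EFAULT;
    case OLD_KERN_PROTECTION_FAILURE:
        return EACCES;
    case OLD_KERN_NO_SPACE:
    case OLD_KERN_RESOURCE_SHORTAGE:
        return ENOMEM;
    case OLD_KERN_INVALID_ARGUMENT:
        return EINVAL;
    default:
        /* KERN_FAILURE ("various") and the unused codes have no fixed mapping. */
        return -1;
    }
}
```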


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
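
For reference, the arithmetic behind a minimum power-of-two alignment can be
sketched as below. `align_up` is a hypothetical helper showing only the
rounding step; it does not reflect the signature of uvm_map() or how the map
code chooses an address.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align; align must be a power of two. */
static uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
    /* A power of two has exactly one bit set. */
    assert(align != 0 && (align & (align - 1)) == 0);

    return (addr + align - 1) & ~(align - 1);
}
```

Restricting the alignment to powers of two is what lets the rounding be done
with a mask instead of a division.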


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
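
A rough sketch of the sizing rule, assuming the memory budget is simply
divided by the per-vnode size. `default_vnode_count` and its parameters are
hypothetical: the size of a vnode is left as an argument, and the fraction is
given in thousandths so that 5 means 0.5%.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of vnodes that fit in permille/1000 of physical memory, but never
 * fewer than min_vnodes (the role played by NVNODE in the text above).
 */
static uint64_t
default_vnode_count(uint64_t physmem_bytes, uint64_t vnode_size,
    unsigned permille, uint64_t min_vnodes)
{
    uint64_t budget, count;

    assert(vnode_size != 0);

    /* Divide first so the multiplication cannot overflow for large memory. */
    budget = physmem_bytes / 1000 * permille;
    count = budget / vnode_size;

    return count > min_vnodes ? count : min_vnodes;
}
```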


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created process (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some functions.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
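
A minimal sketch of why the two properties above go together: when the client
embeds the handle in its own storage and each handle carries its own list
links, inserting into and removing from a bucket needs no allocation and no
search. The names `struct callout_sketch`, `struct wheel`, `callout_insert`,
`callout_remove` and `WHEEL_SIZE` are hypothetical, and expiry handling is
left out; this is not the kernel's callout code.

```c
#include <assert.h>
#include <stddef.h>

#define WHEEL_SIZE 8   /* must be a power of two for the mask below */

/* Handle storage supplied by the client, e.g. embedded in a driver softc. */
struct callout_sketch {
    struct callout_sketch  *next;
    struct callout_sketch **prevp;  /* address of the pointer that points at us */
    void                  (*fn)(void *);
    void                   *arg;
};

struct wheel {
    struct callout_sketch *bucket[WHEEL_SIZE];
};

/* O(1): push the handle onto the head of the bucket chosen by expiry tick. */
static void
callout_insert(struct wheel *w, struct callout_sketch *c, unsigned ticks)
{
    struct callout_sketch **head = &w->bucket[ticks & (WHEEL_SIZE - 1)];

    assert(c->prevp == NULL);       /* must not already be pending */
    c->next = *head;
    if (c->next != NULL)
        c->next->prevp = &c->next;
    c->prevp = head;
    *head = c;
}

/* O(1): unlink via the back pointer; a handle that is not pending is left alone. */
static void
callout_remove(struct callout_sketch *c)
{
    if (c->prevp == NULL)
        return;
    *c->prevp = c->next;
    if (c->next != NULL)
        c->next->prevp = c->prevp;
    c->next = NULL;
    c->prevp = NULL;
}
```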


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump
and relocating it to a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). The process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock everywhere else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
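
As a rough illustration of "a low address and a size" with machine-dependent code accounting for stack direction, the initial stack pointer could be derived as below. This is a hedged sketch, not the committed code: the function name and the grows_down flag are hypothetical, and real ports would also apply their own alignment rules.

```c
#include <stddef.h>
#include <stdint.h>

/* stack_base is the lowest address of the area, stack_size its length.
 * grows_down is a hypothetical per-port property: nonzero when the
 * stack grows toward lower addresses. Alignment is deliberately ignored. */
static uintptr_t initial_stack_pointer(void *stack_base, size_t stack_size,
    int grows_down)
{
	uintptr_t lo = (uintptr_t)stack_base;

	/* Downward-growing stacks start at the top of the area,
	 * upward-growing stacks start at the bottom. */
	return grows_down ? lo + stack_size : lo;
}
```

The caller describes the area the same way on every port; only the direction-dependent choice of starting address differs.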


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
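
The rule described above (hard limit taken from the system-wide maximum, soft limit clamped to it) can be sketched as follows. This is an illustration only, assuming a simplified limit pair; toy_rlimit and init_limit are hypothetical names, not kernel structures or functions.

```c
/* Simplified stand-in for a soft/hard resource limit pair. */
struct toy_rlimit {
	unsigned long cur;	/* soft limit */
	unsigned long max;	/* hard limit */
};

static unsigned long ul_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* default_soft plays the role of NOFILE or MAXUPRC;
 * system_max plays the role of maxfiles or maxproc. */
static struct toy_rlimit init_limit(unsigned long default_soft,
    unsigned long system_max)
{
	struct toy_rlimit l;

	l.max = system_max;				/* not "infinity" */
	l.cur = ul_min(default_soft, system_max);	/* never above max */
	return l;
}
```

With this shape, a program that sizes a table from the reported hard limit gets a finite, realistic value, and the soft limit can never exceed the hard one.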


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
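
The mapping above swaps the source and destination arguments for the copy routines, which is easy to get wrong by hand. A minimal sketch of the same mapping as wrapper functions (the compat_ names are hypothetical and exist only for this illustration):

```c
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len): argument order swaps. */
static void compat_bcopy(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
}

/* ovbcopy(src, dst, len) -> memmove(dst, src, len): overlap-safe copy. */
static void compat_ovbcopy(const void *src, void *dst, size_t len)
{
	memmove(dst, src, len);
}

/* bcmp(a, b, len) -> memcmp(a, b, len): same argument order; callers
 * should only test the result against zero. */
static int compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len);
}

/* bzero(buf, len) -> memset(buf, 0, len). */
static void compat_bzero(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```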


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision


# 1.488 26-Dec-2016 pgoyette

Add a BIOHIST option. As mentioned on tech-kern.


Revision tags: nick-nhusb-base-20161204
# 1.487 16-Nov-2016 pgoyette

Initialize the bufq code right before we're ready to load the strategy
modules.


# 1.486 16-Nov-2016 pgoyette

Define a new module class for the bufq_strategy modules. These need to
be loaded and initialized before autoconfigure runs, since some devices
(like disks and floppy drives) want to call bufq_alloc().


# 1.485 16-Nov-2016 pgoyette

Modularize the various bufq strategies


Revision tags: pgoyette-localcount-20161104
# 1.484 02-Nov-2016 pgoyette

* Split sys/kern/sys_process.c into three parts:
1 - ptrace(2) syscall for native emulation
2 - common ptrace(2) syscall code (shared with compat_netbsd32)
3 - support routines that are shared with PROCFS and/or KTRACE

* Add module glue for #1 and #2. Both modules will be built-in to the
kernel if "options PTRACE" is included in the config file (this is
the default, defined in sys/conf/std).

* Mark the ptrace(2) syscall as modular in syscalls.master (generated
files will be committed shortly).

* Conditionalize all remaining portions of PTRACE code on a new kernel
option PTRACE_HOOKS.

XXX Instead of PROCFS depending on 'options PTRACE', we should probably
just add a procfs attribute to the sys/kern/sys_process.c file's
entry in files.kern, and add PROCFS to the "#if defineds" for
process_domem(). It's really confusing to have two different ways
of requiring this file.


Revision tags: nick-nhusb-base-20161004
# 1.483 17-Sep-2016 maxv

This is just a temporary stack that holds fake arguments, and that gets
remapped as RW in sys_execve. Still, in this small window, it does not need
to be executable.


Revision tags: localcount-20160914 pgoyette-localcount-20160806 pgoyette-localcount-20160726 pgoyette-localcount-base nick-nhusb-base-20160907
# 1.482 07-Jul-2016 msaitoh

branches: 1.482.2;
KNF. Remove extra spaces. No functional change.


# 1.481 04-Jun-2016 palle

Added missing "it" to comment in start_init()


Revision tags: nick-nhusb-base-20160529
# 1.480 22-May-2016 christos

reduce #ifdef mess caused by PaX


Revision tags: nick-nhusb-base-20160422
# 1.479 28-Mar-2016 macallan

fix this properly.
uap is supposed to hold init's argv[], so it's 3 * sizeof(char *), the bug
was to copyout(..., sizeof(args)) which is an array of syscallargs, not argv
*


# 1.478 28-Mar-2016 macallan

do not assume that syscallarg(const char *) and (char *) are the same size
first step to make n32 kernels run binaries again


Revision tags: nick-nhusb-base-20160319 nick-nhusb-base-20151226
# 1.477 08-Dec-2015 christos

Replace DIOCGPART -> DIOCGPARTINFO which returns the data needed instead of
pointers.


# 1.476 07-Dec-2015 pgoyette

Modularize drvctl(4)


# 1.475 26-Nov-2015 ozaki-r

Fix build dependency of if_llatbl.c

if_llatbl.c is required if inet or inet6 is enabled. Depending on ether
doesn't suit for NDP case.


# 1.474 20-Nov-2015 christos

get rid of the suword {m,j}umbo and check return of copyout.


# 1.473 06-Nov-2015 pgoyette

In sysv_sem.c, defer establishment of exithook so we can initialize the
module code from module_init() rather than waiting until after calling
exec_init(). Use a RUN_ONCE routine at entry to each sys_sem* syscall
to establish the exithook, and no longer KASSERT that the hook has
been set before removing it. (A manually loaded module can be unloaded
before any syscalls have been invoked.)

Remove the conditional calls to the various xxx_init() routines from
init_main.c - we now rely on module_init() to handle initialization.

Let each sub-component's xxx_init() routine handle its own sysctl
sub-tree initialization; this removes another set of #ifdef ugliness.

Tested both built-in and loadable versions and verified that atf
test kernel/t_sysv passes.


# 1.472 29-Oct-2015 mrg

introduce a new way of handling SYSCALL_DEBUG messages -- send them to
a kernel history, settable via the SCDEBUG_KERNHIST flag.

this requires a fairly significantly different set of messages than the
normal debug as histories are restricted:
- each message can take one literal format string and up to 4
arguments
- the arguments can not be strings if you want vmstat -u to
work (this could be fixed, and i might, as it would be nice
if we could print syscall names as well as numbers.)

introduce SCDEBUG_DEFAULT that is settable in the kernel config.

fix a problem in kernhist_dump_histories() where it would crash when a
history with no allocated entries was found.

extend kernhist_dumpmask() to handle the usbhist and scdebughist.


# 1.471 17-Oct-2015 jmcneill

initialize MODULE_CLASS_DRIVER modules before the drivers themselves are loaded during autoconfiguration


Revision tags: nick-nhusb-base-20150921
# 1.470 14-Sep-2015 uebayasi

Handle splash image generation better.


# 1.469 31-Aug-2015 ozaki-r

Fix building kernels w/o ether


# 1.468 31-Aug-2015 ozaki-r

Hook up lltable/llentry with the kernel (and rumpkernel)

It is built and initialized on bootup, but there is no user for now.

Most codes in in.c are imported from FreeBSD as well as lltable/llentry.


Revision tags: nick-nhusb-base-20150606
# 1.467 06-May-2015 hannken

Remove miscfs/syncfs and

- move the syncer into kern/vfs_subr.c.

- change the syncer to process the mountlist and VFS_SYNC as appropriate.

- use an API for mount points similar to the API for vnodes:
- vfs_syncer_add_to_worklist(struct mount *mp) to add
- vfs_syncer_remove_from_worklist(struct mount *mp) to remove a mount.

No objections on tech-kern@


# 1.466 30-Apr-2015 nat

Remove unintended whitespace.


# 1.465 30-Apr-2015 nat

Added a new option for embedding a splash screen into kernel.
Add: options SPLASHSCREEN
makeoptions SPLASHSCREEN_IMAGE="path/to/image"
to your config file. So far it will work on amd64 and RPI/RPI2.

This commit was with ideas, help, and OK from jmcneill@.


# 1.464 27-Apr-2015 pgoyette

Replace a home-grown run-once implementation with the real RUN_ONCE()


# 1.463 23-Apr-2015 pgoyette

Update initialization of sysmon and its components. These are now handled as part of module initialization, and do not require manual invocation. sysmon_taskq is special, since it is potentially used by several non-module users who may need the facility before modules are fully ready.


Revision tags: nick-nhusb-base-20150406
# 1.462 06-Mar-2015 mrg

wait for config_mountroot threads to complete before we tell init it
can start up. this solves the problem where a console device needs
mountroot to complete attaching, and must create wsdisplay0 before
init tries to open /dev/console. fixes PR#49709.

XXX: pullup-7


Revision tags: nick-nhusb-base
# 1.461 27-Nov-2014 uebayasi

branches: 1.461.2;
Yield the main thread only after exiting critical section.


# 1.460 04-Oct-2014 riastradh

Make uuidgen(2) generate v4 (random) uuids.

Rip out all the needless MAC address and date/time leakage. No more
uuid_init necessary, nor contention over a global uuid state.

While here, simplify uuid_snprintf and fix a strict aliasing
violation.
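
For reference, a version 4 UUID is 16 random bytes with the version and variant bits fixed as defined by RFC 4122. The sketch below shows only that bit-stamping step; it assumes the 16 bytes were already filled from a suitable random source, and uuid_v4_stamp is a hypothetical name rather than a function from the kernel.

```c
#include <stdint.h>

/* Turn 16 already-random bytes into a version 4 (random) UUID by
 * forcing the version nibble and the variant bits. */
static void uuid_v4_stamp(uint8_t b[16])
{
	b[6] = (uint8_t)((b[6] & 0x0f) | 0x40);	/* version = 4 */
	b[8] = (uint8_t)((b[8] & 0x3f) | 0x80);	/* variant = 10xxxxxx */
}
```

No MAC address or timestamp enters the result, which matches the stated goal of removing that leakage.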


# 1.459 14-Aug-2014 riastradh

Defer cprng_fast_init until CPUs are detected.


Revision tags: netbsd-7-base tls-maxphys-base
# 1.458 10-Aug-2014 tls

branches: 1.458.2;
Merge tls-earlyentropy branch into HEAD.


Revision tags: tls-earlyentropy-base
# 1.457 07-Jul-2014 riastradh

Initialize ubchist earlier.


# 1.456 19-May-2014 rmind

Move ipi_sysinit() after configure2(); we want secondary CPUs attached.
Might revisit if there is a need to use this interface earlier.


# 1.455 19-May-2014 rmind

Implement MI IPI interface with cross-call support.


Revision tags: yamt-pagecache-base9 riastradh-xf86-video-intel-2-7-1-pre-2-21-15 riastradh-drm2-base3 rmind-smpnet-nbase rmind-smpnet-base
# 1.454 02-Oct-2013 apb

branches: 1.454.2;
Add "/rescue/init" to the end of the initpaths list, which
now contains: { "/sbin/init", "/sbin/oinit", "/sbin/init.bak",
"/rescue/init", NULL }.

XXX: The kernel's use of initpaths is not documented.
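
Since the commit notes that the kernel's use of initpaths is undocumented, the following is only a guess at the general shape: walk the NULL-terminated list in order and stop at the first path that can be started. The list contents are taken from the message above; first_usable_init, the try_exec callback, and the two example predicates are hypothetical.

```c
#include <stddef.h>
#include <string.h>

static const char *const initpaths[] = {
	"/sbin/init",
	"/sbin/oinit",
	"/sbin/init.bak",
	"/rescue/init",
	NULL,
};

/* try_exec is a hypothetical callback: 0 means the path could be
 * started, nonzero means try the next candidate. */
static const char *first_usable_init(int (*try_exec)(const char *))
{
	for (size_t i = 0; initpaths[i] != NULL; i++) {
		if (try_exec(initpaths[i]) == 0)
			return initpaths[i];
	}
	return NULL;	/* every candidate failed */
}

/* Example predicates for illustration only. */
static int only_rescue_works(const char *path)
{
	return strcmp(path, "/rescue/init") != 0;
}

static int nothing_works(const char *path)
{
	(void)path;
	return 1;
}
```

Placing "/rescue/init" last means it is consulted only after the three /sbin candidates have failed.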


# 1.453 28-Aug-2013 riastradh

Tighten initialization of rnd softints.

- Do rnd_init_softint as early as possible in main, after configure2,
and before networking is initialized.

- Initialize the rnd_wakeup softint in rnd_init_softint, not lazily in
rnd_schedule_wakeup.

ok tls


# 1.452 27-Aug-2013 riastradh

Back out the recent rnd stop-gap/stop-gap/stop-gap measures.

This reverts

sys/dev/rnd_private.h -> r1.1
sys/kern/init_main.c -> r1.450
sys/kern/kern_rndq.c -> r1.14
sys/kern/kern_rndsink.c -> r1.2

Parts of these changes will be added back, and the rndsource
callbacks will be fixed to avoid the lock recursion bug that
motivated the stop-gaps in the first place.

ok tls


# 1.451 25-Aug-2013 tls

Attempt to resolve locking issues at kernel startup on platforms with
hardware RNGs using the polling mode of operation:

1) Initialize the rng subsystem soft interrupts as early in kernel startup
as seems safe (we have no MI guarantee that softints are working at all
until configure2() returns, AFAICT).

This should have the rnd subsystem able to process events via softint
before the network subsystem (a notorious early user of entropy) starts.

2) Remove the shortcut calls to rnd_process_events() from
rnd_schedule_process(), with the result that until the softint is installed
rnd_process_events() is a NOP.

3) Directly call rnd_process_events() in rnd_extract_data(),
rnd_maybe_extract(), and rnd_init_softint(). This should suck up any
samples actually collected as early as possible.


Revision tags: riastradh-drm2-base2 riastradh-drm2-base1 riastradh-drm2-base
# 1.450 20-Jun-2013 christos

branches: 1.450.2;
Initialize the rnd softint explicitly via a function late in main. Avoids
LOCKDEBUG panic since softint_establish() was called via wdcintr -> wddone
from an interrupt context and tried to acquire a non-spin mutex.


# 1.449 05-Jun-2013 christos

IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.


Revision tags: agc-symver-base
# 1.448 18-Mar-2013 para

Calculate vnode cache size based on the resource it gets allocated from.
This stops kern.maxvnodes from being set so high that it exhausts the
available space in kmem.

http://mail-index.netbsd.org/tech-kern/2013/03/08/msg015095.html


# 1.447 21-Feb-2013 pgoyette

Move boottime50 and its associated sysctl into the compat module. As
noted on tech-kern. Should fix PR/47579.

OK christos@

Will request pull-up to 6.0 in a few days.


# 1.446 09-Feb-2013 christos

printflike maintenance.


Revision tags: yamt-pagecache-base8 yamt-pagecache-base7 yamt-pagecache-base6
# 1.445 29-Jul-2012 mlelstv

branches: 1.445.2;
Do not call setroot() from MD code and from MI code, which has
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.

No longer wrap MD cpu_rootconf(), as hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code which makes
rootconf() now a wrapper to setroot().

Adjust several MD routines to set the global booted_device, booted_partition
variables instead of passing partial information to setroot().

Make cpu_rootconf(9) describe the calling order.


# 1.444 14-Jun-2012 martin

Do not try to find the wedge we booted from if opendisk(booted_device)
failed.


# 1.443 10-Jun-2012 mlelstv

Make detection of root on wedges (dk(4)) machine independent. Remove
MD code for x86, xen, sparc64.


Revision tags: jmcneill-usbmp-base10 yamt-pagecache-base5 jmcneill-usbmp-base9 yamt-pagecache-base4 jmcneill-usbmp-base8 jmcneill-usbmp-base7 jmcneill-usbmp-base6 jmcneill-usbmp-base5 jmcneill-usbmp-base4 jmcneill-usbmp-base3
# 1.442 19-Feb-2012 rmind

Remove COMPAT_SA / KERN_SA. Welcome to 6.99.3!
Approved by core@.


Revision tags: jmcneill-usbmp-base2 netbsd-6-base
# 1.441 02-Feb-2012 tls

branches: 1.441.2;
Entropy-pool implementation move and cleanup.

1) Move core entropy-pool code and source/sink/sample management code
to sys/kern from sys/dev.

2) Remove use of NRND as test for presence of entropy-pool code throughout
source tree.

3) Remove use of RND_ENABLED in device drivers as microoptimization to
avoid expensive operations on disabled entropy sources; make the
rnd_add calls do this directly so all callers benefit.

4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might
have led to slight entropy overestimation for some sources.

5) Add new source types for environmental sensors, power sensors, VM
system events, and skew between clocks, with a sample implementation
for each.

ok releng to go in before the branch due to the difficulty of later
pullup (widespread #ifdef removal and moved files). Tested with release
builds on amd64 and evbarm and live testing on amd64.


# 1.440 29-Jan-2012 rmind

- Add mi_cpu_init() and initialise cpu_lock and kcpuset_attached/running there.
- Add kcpuset_running which gets set in idle_loop().
- Use kcpuset_running in pserialize_perform().


# 1.439 24-Jan-2012 christos

Use and define ALIGN() ALIGN_POINTER() and STACK_ALIGN() consistently,
and avoid defining them in 10 different places if not needed.


# 1.438 04-Dec-2011 jym

Implement the register/deregister/evaluation API for secmodel(9). It
allows registration of callbacks that can be used later for
cross-secmodel "safe" communication.

When a secmodel wishes to know a property maintained by another
secmodel, it has to submit a request to it so the other secmodel can
proceed to evaluating the request. This is done through the
secmodel_eval(9) call; example:

    bool isroot;
    error = secmodel_eval("org.netbsd.secmodel.suser", "is-root",
        cred, &isroot);
    if (error == 0 && !isroot)
        result = KAUTH_RESULT_DENY;

This one asks the suser module if the credentials are assumed to be root
when evaluated by suser module. If the module is present, it will
respond. If absent, the call will return an error.

Args and command are arbitrarily defined; it's up to the secmodel(9) to
document what it expects.

Typical example is securelevel testing: when someone wants to know
whether securelevel is raised above a certain level or not, the caller
has to request this property from the secmodel_securelevel(9) module.
Given that securelevel module may be absent from system's context (thus
making access to the global "securelevel" variable impossible or
unsafe), this API can cope with this absence and return an error.

We are using secmodel_eval(9) to implement a secmodel_extensions(9)
module, which plugs with the bsd44, suser and securelevel secmodels
to provide the logic behind curtain, usermount and user_set_cpu_affinity
modes, without adding hooks to traditional secmodels. This solves a
real issue with the current secmodel(9) code, as usermount or
user_set_cpu_affinity are not really tied to secmodel_suser(9).

The secmodel_eval(9) is also used to restrict security.models settings
when securelevel is above 0, through the "is-securelevel-above"
evaluation:
- curtain can be enabled any time, but cannot be disabled if
securelevel is above 0.
- usermount/user_set_cpu_affinity can be disabled any time, but cannot
be enabled if securelevel is above 0.

Regarding sysctl(7) entries:
curtain and usermount are now found under security.models.extensions
tree. The security.curtain and vfs.generic.usermount are still
accessible for backwards compat.

Documentation is incoming, I am proof-reading my writings.

Written by elad@, reviewed and tested (anita test + interact for rights
tests) by me. ok elad@.

See also
http://mail-index.netbsd.org/tech-security/2011/11/29/msg000422.html

XXX might consider va0 mapping too.

XXX Having a secmodel(9) specific printf (like aprint_*) for reporting
secmodel(9) errors might be a good idea, but I am not sure on how
to design such a function right now.


Revision tags: jmcneill-usbmp-pre-base2 jmcneill-usbmp-base
# 1.437 19-Nov-2011 tls

branches: 1.437.2;
First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>. This change includes
the following:

An initial cleanup and minor reorganization of the entropy pool
code in sys/dev/rnd.c and sys/dev/rndpool.c. Several bugs are
fixed. Some effort is made to accumulate entropy more quickly at
boot time.

A generic interface, "rndsink", is added, for stream generators to
request that they be re-keyed with good quality entropy from the pool
as soon as it is available.

The arc4random()/arc4randbytes() implementation in libkern is
adjusted to use the rndsink interface for rekeying, which helps
address the problem of low-quality keys at boot time.

An implementation of the FIPS 140-2 statistical tests for random
number generator quality is provided (libkern/rngtest.c). This
is based on Greg Rose's implementation from Qualcomm.

A new random stream generator, nist_ctr_drbg, is provided. It is
based on an implementation of the NIST SP800-90 CTR_DRBG by
Henric Jungheim. This generator uses AES in a modified counter
mode to generate a backtracking-resistant random stream.

An abstraction layer, "cprng", is provided for in-kernel consumers
of randomness. The arc4random/arc4randbytes API is deprecated for
in-kernel use. It is replaced by "cprng_strong". The current
cprng_fast implementation wraps the existing arc4random
implementation. The current cprng_strong implementation wraps the
new CTR_DRBG implementation. Both interfaces are rekeyed from
the entropy pool automatically at intervals justifiable from best
current cryptographic practice.

In some quick tests, cprng_fast() is about the same speed as
the old arc4randbytes(), and cprng_strong() is about 20% faster
than rnd_extract_data(). Performance is expected to improve.

The AES code in src/crypto/rijndael is no longer an optional
kernel component, as it is required by cprng_strong, which is
not an optional kernel component.

The entropy pool output is subjected to the rngtest tests at
startup time; if it fails, the system will reboot. There is
approximately a 3/10000 chance of a false positive from these
tests. Entropy pool _input_ from hardware random numbers is
subjected to the rngtest tests at attach time, as well as the
FIPS continuous-output test, to detect bad or stuck hardware
RNGs; if any are detected, they are detached, but the system
continues to run.

A problem with rndctl(8) is fixed -- data structures with
pointers in arrays are no longer passed to userspace (this
was not a security problem, but rather a major issue for
compat32). A new kernel will require a new rndctl.

The sysctl kern.arandom() and kern.urandom() nodes are hooked
up to the new generators, but the /dev/*random pseudodevices
are not, yet.

Manual pages for the new kernel interfaces are forthcoming.


Revision tags: jmcneill-audiomp3-base yamt-pagecache-base3 yamt-pagecache-base2 yamt-pagecache-base
# 1.436 28-Sep-2011 jruoho

branches: 1.436.2;
Initialize cpufreq(9) normally from main().


# 1.435 07-Aug-2011 rmind

Add kcpuset(9) - a reworked dynamic CPU set implementation for kernel.
Suitable for use during the early boot. MD and other implementations
should be replaced with this interface.

Discussed on: tech-kern@


# 1.434 30-Jul-2011 christos

Add an implementation of passive serialization as described in expired
US patent 4809168. This is a reader / writer synchronization mechanism,
designed for lock-less read operations.


# 1.433 02-Jul-2011 bouyer

Fix kern/45093 as discussed on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2011/06/17/msg010734.html

The cause of the problem is that the so_pendfree is processed with
the softnet_lock held at one point, and processing the list
calls sodoloanfree() which may kpause(). As the thread sleeps with
softnet_lock held, it ultimately causes a deadlock (see the PR or tech-kern
thread for details).
Although it should be possible to call sodopendfree() after releasing
the socket lock, it's not so easy to know where the socket lock is held and
where it's not, so we may hit the issue again later.
Add a kernel thread to handle the so_pendfree list, and wake up this
thread when adding mbufs to this list. Get rid of the various sodopendfree()
calls, hopefully fixing the problem definitively.


# 1.432 12-Jun-2011 rmind

Welcome to 5.99.53! Merge rmind-uvmplock branch:

- Reorganize locking in UVM and provide extra serialisation for pmap(9).
New lock order: [vmpage-owner-lock] -> pmap-lock.

- Simplify locking in some pmap(9) modules by removing P->V locking.

- Use lock object on vmobjlock (and thus vnode_t::v_interlock) to share
the locks amongst UVM objects where necessary (tmpfs, layerfs, unionfs).

- Rewrite and optimise x86 TLB shootdown code, make it simpler and cleaner.
Add TLBSTATS option for x86 to collect statistics about TLB shootdowns.

- Unify /dev/mem et al in MI code and provide required locking (removes
kernel-lock on some ports). Also, avoid cache-aliasing issues.

Thanks to Andrew Doran and Joerg Sonnenberger, as their initial patches
formed the core changes of this branch.


Revision tags: rmind-uvmplock-nbase cherry-xenmp-base rmind-uvmplock-base
# 1.431 31-May-2011 dyoung

branches: 1.431.2;
Don't use the C preprocessor to configure USERCONF. Instead, either do
or do not link in subr_userconf.c and x86_userconf.c.

Provide no-op stubs for userconf_bootinfo(), userconf_init(), and
userconf_prompt().

Delete all occurrences of #include "opt_userconf.h" as well as USERCONF
and __HAVE_USERCONF_BOOTINFO #ifdef'age.


# 1.430 26-May-2011 uebayasi

Support userconf(4) command in boot(8)/boot.cfg(5) on i386/amd64.

From jmmv@, no objections seen in the proposed thread:

http://mail-index.netbsd.org/tech-kern/2009/01/22/msg004081.html


# 1.429 19-May-2011 rmind

Re-implement kthread_join(9), so that it actually works (hi haad@).


# 1.428 14-Apr-2011 yamt

comment


Revision tags: bouyer-quota2-nbase bouyer-quota2-base
# 1.427 28-Jan-2011 pooka

Move sysctl routines from init_sysctl.c to kern_descrip.c (for
descriptors) and kern_proc.c (for processes). This makes them
usable in a rump kernel, in case somebody was wondering.


# 1.426 18-Jan-2011 matt

branches: 1.426.2;
Move up evcnt_init to before cpu_startup()


# 1.425 17-Jan-2011 uebayasi

Include internal definitions (uvm/uvm.h) only where necessary.


Revision tags: jruoho-x86intr-base matt-mips64-premerge-20101231
# 1.424 16-Dec-2010 eeh

branches: 1.424.2;
ubc_init needs to run while we're still cold or uvmhist breaks.


Revision tags: uebayasi-xip-base4 uebayasi-xip-base3 yamt-nfs-mp-base11
# 1.423 21-Aug-2010 pgoyette

Define a set of new kernel locking primitives to implement the recursive
kernconfig_mutex. Update module subsystem to use this mutex rather than
its own internal (non-recursive) mutex. Make module_autoload() do its
own locking to be consistent with the rest of the module_xxx() calls.
Update module(9) man page appropriately.

As discussed on tech-kern over the last few weeks.

Welcome to NetBSD 5.99.39 !


Revision tags: uebayasi-xip-base2 yamt-nfs-mp-base10
# 1.422 26-Jun-2010 pgoyette

1. Add an allocator for 'struct module *' and use it instead of local
allocations.

2. Add a new member mod_flags to the 'struct module *' and define
MODFLG_MUST_FORCE. If this flag is set and the entry is on the list
of builtins, it means that the module has been explicitly unloaded
and any re-loads will require the MODCTL_LOAD_FORCE flag. Provide a
module_require_force() method to set this flag; once set, it should
never be unset.

3. Rename original module_init2() to module_start_unload_thread() to be
more descriptive of what it does.

4. Add a new module_builtin_require_force() routine that sets the
MODFLG_MUST_FORCE flag for any module that has not yet successfully
been initialized. Call it after module_init_class(MODULE_CLASS_ANY)
to disable remaining built-in modules.

This makes built-in versions of the xxxVERBOSE modules work once more,
resolving breakage reported by jruoho@ and njoly@.

Discussed on tech-kern, and comments and suggestions implemented. No
additional discussion for last week. Tested only on amd64 systems, but
there's nothing here that should be port- or architecture-specific (no
more specific than existing module implementation) so others should not
break.


# 1.421 25-Jun-2010 tsutsui

Add config_mountroot(9), which defers device configuration
after mountroot(), like config_interrupt(9) that defers
configuration after interrupts are enabled.
This will be used for devices that require firmware loaded
from the root file system by firmload(9) to complete device
initialization (getting MAC address etc).
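
The deferral idea can be sketched in user-space C as a small queue of callbacks that is drained once the root file system is available. This is not the config_mountroot(9) implementation: defer_until_mountroot(), run_mountroot_deferred(), the fixed-size queue and the choice to run a callback immediately when root is already mounted are all assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8	/* arbitrary limit for this sketch */

typedef void (*deferred_fn)(void *arg);

struct deferred {
	deferred_fn fn;
	void *arg;
};

static struct deferred mountroot_queue[MAX_DEFERRED];
static size_t mountroot_pending;
static int root_mounted;

/*
 * Queue fn(arg) until root is mounted.  If root is already mounted the
 * callback runs right away (a design choice of this sketch).
 * Returns 0 on success, -1 if the queue is full.
 */
static int
defer_until_mountroot(deferred_fn fn, void *arg)
{
	assert(fn != NULL);
	if (root_mounted) {
		fn(arg);
		return 0;
	}
	if (mountroot_pending == MAX_DEFERRED)
		return -1;
	mountroot_queue[mountroot_pending].fn = fn;
	mountroot_queue[mountroot_pending].arg = arg;
	mountroot_pending++;
	return 0;
}

/* Called once root is mounted: run every queued callback in order. */
static void
run_mountroot_deferred(void)
{
	root_mounted = 1;
	for (size_t i = 0; i < mountroot_pending; i++)
		mountroot_queue[i].fn(mountroot_queue[i].arg);
	mountroot_pending = 0;
}

/* Stand-in for a driver step that needs firmware from the root fs. */
static void
load_firmware_stub(void *arg)
{
	int *loaded = arg;
	*loaded = 1;
}
```

A driver in this model would queue its firmware-dependent step at attach time and finish initialization only when run_mountroot_deferred() is called.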

No objection on tech-kern@:
http://mail-index.NetBSD.org/tech-kern/2010/06/18/msg008370.html
and will also fix PR kern/43125.


# 1.420 10-Jun-2010 pooka

lwp0 seems like an lwp instead of a process, so move bits related
to it from kern_proc.c to kern_lwp.c. This makes kern_proc
"scheduling-clean" and more easily usable in environments with a
non-integrated scheduler (like, to take a random example, rump).


Revision tags: uebayasi-xip-base1
# 1.419 21-Apr-2010 pooka

Reduce #ifdef spew by attaching wapbl as a module.
(no, it's still too ifdef-ridden to be able to actually do anything
useful and module-like like load into any kernel)


Revision tags: yamt-nfs-mp-base9 uebayasi-xip-base
# 1.418 05-Feb-2010 cegger

branches: 1.418.2; 1.418.4;
fix LOCKDEBUG panic 'uninitialized lock'.
seminit() calls exithook_establish(). exithook_establish() uses the exec_lock.
exec_lock is initialized by exec_init(1).
Call exec_init(1) before seminit().


# 1.417 31-Jan-2010 pooka

uncommit part which wasn't supposed to get committed yet


# 1.416 31-Jan-2010 pooka

Pass root device as a parameter to domountroothook().


# 1.415 31-Jan-2010 hubertf

Replace more printfs with aprint_normal / aprint_verbose
Makes "boot -z" go mostly silent for me.


# 1.414 19-Jan-2010 pooka

Redefine bpf linkage through an always present op vector, i.e.
#if NBPFILTER is no longer required in the client. This change
doesn't yet add support for loading bpf as a module, since drivers
can register before bpf is attached. However, callers of bpf can
now be modularized.

Dynamically loadable bpf could probably be done fairly easily with
coordination from the stub driver and the real driver by registering
attachments in the stub before the real driver is loaded and doing
a handoff. ... and I'm not going to ponder the depths of unload
here.

Tested with i386/MONOLITHIC, modified MONOLITHIC without bpf and rump.


# 1.413 23-Dec-2009 elad

Including sysctl.h once is enough.


# 1.412 17-Dec-2009 rmind

Replace a few USER_TO_UAREA/UAREA_TO_USER uses, reduce sys/user.h inclusions.


Revision tags: matt-premerge-20091211
# 1.411 27-Nov-2009 pooka

Move rootfs-related init from init_main() to vfs_mountroot().
Reduces code re-written in rump.


# 1.410 15-Nov-2009 elad

Include miscfs/specfs/specdev.h for spec_init().


# 1.409 14-Nov-2009 elad

- Move kauth_init() a little bit higher.

- Add spec_init() to authorize special device actions (and passthru too for
the time being). Move policy out of secmodel_suser.


# 1.408 03-Nov-2009 dyoung

Add a kernel configuration flag, SPLDEBUG, that activates a per-CPU log
of transitions to IPL_HIGH from lower IPLs. SPLDEBUG is only available
on i386 and Xen kernels, today.

'options SPLDEBUG' adds instrumentation to spllower() and splraise() as
well as routines to start/stop debugging and to record IPL transitions:
spldebug_start(), spldebug_stop(), spldebug_raise(), spldebug_lower().


Revision tags: jym-xensuspend-nbase
# 1.407 26-Oct-2009 rmind

Update comment about proc0_init().


# 1.406 06-Oct-2009 elad

Add a (weak aliased) machdep_init() as a place to do machdep initialization
that can't happen as early as the other init functions as called from
cpu_startup() -- for example, register kauth(9) listeners.

Put unprivileged policy in the x86 code; used by i386, amd64, and xen.


# 1.405 03-Oct-2009 elad

- Move sched_listener and co. from kern_synch.c to sys_sched.c, where it
really belongs (suggested by rmind@),

- Rename sched_init() to synch_init(), and introduce a new sched_init()
in sys_sched.c where we (a) initialize the sysctl node (no more
link-set) and (b) listen on the process scope with sched_listener.

Reviewed by and okay rmind@.


# 1.404 02-Oct-2009 elad

Move ptrace's security policy back to the subsystem itself.

Add a ptrace_init() so we have a place to register the listener; called
next to ktrinit().


# 1.403 02-Oct-2009 elad

First part of secmodel cleanup and other misc. changes:

- Separate the suser part of the bsd44 secmodel into its own secmodel
and directory, pending even more cleanups. For revision history
purposes, the original location of the files was

src/sys/secmodel/bsd44/secmodel_bsd44_suser.c
src/sys/secmodel/bsd44/suser.h

- Add a man-page for secmodel_suser(9) and update the one for
secmodel_bsd44(9).

- Add a "secmodel" module class and use it. Userland program and
documentation updated.

- Manage secmodel count (nsecmodels) through the module framework.
This eliminates the need for secmodel_{,de}register() calls in
secmodel code.

- Prepare for secmodel modularization by adding relevant module bits.
The secmodels don't allow auto unload. The bsd44 secmodel depends
on the suser and securelevel secmodels. The overlay secmodel depends
on the bsd44 secmodel. As the module class is only cosmetic, and to
prevent ambiguity, the bsd44 and overlay secmodels are prefixed with
"secmodel_".

- Adapt the overlay secmodel to recent changes (mainly vnode scope).

- Stop using link-sets for the sysctl node(s) creation.

- Keep sysctl variables under nodes of their relevant secmodels. In
other words, don't create duplicates for the suser/securelevel
secmodels under the bsd44 secmodel, as the latter is merely used
for "grouping".

- For the suser and securelevel secmodels, "advertise presence" in
relevant sysctl nodes (sysctl.security.models.{suser,securelevel}).

- Get rid of the LKM preprocessor stuff.

- As secmodels are now modules, there's no need for an explicit call
to secmodel_start(); it's handled by the module framework. That
said, the module framework was adjusted to properly load secmodels
early during system startup.

- Adapt rump to changes: Instead of using empty stubs for securelevel,
simply use the suser secmodel. Also replace secmodel_start() with a
call to secmodel_suser_start().

- 5.99.20.

Testing was done on i386 ("release" build). Separated module_init()
changes were tested on sparc and sparc64 as well by martin@ (thanks!).

Mailing list reference:

http://mail-index.netbsd.org/tech-kern/2009/09/25/msg006135.html


# 1.402 29-Sep-2009 dyoung

#include "drvctl.h" for the NDRVCTL definition. Without the NDRVCTL
definition, drvctl_init() is not called, the drvctl_eventq is not
initialized, and the kernel will panic in devmon_insert() when a
device is detached.

Thanks to Jared McNeill for pointing out the panic.


# 1.401 21-Sep-2009 pooka

Split config_init() into config_init() and config_init_mi() to help
platforms which want to call config_init() very early in the boot.


# 1.400 16-Sep-2009 pooka

Replace a large number of link set based sysctl node creations with
calls from subsystem constructors. Benefits both future kernel
modules and rump.

no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL


Revision tags: yamt-nfs-mp-base8
# 1.399 13-Sep-2009 pooka

Wipe out the last vestiges of POOL_INIT with one swift stroke. In
most cases, use a proper constructor. For proplib, give a local
equivalent of POOL_INIT for the kernel object implementation. This
way the code structure can be preserved, and a local link set is
not hazardous anyway (unless proplib is split to several modules,
but that'll be the day).

tested by booting a kernel in qemu and compile-testing i386/ALL


# 1.398 03-Sep-2009 pooka

Move configure() and configure2() from subr_autoconf.c to init_main.c,
since they are only peripherally related to the autoconf subsystem
and more related to boot initialization. Also, apply _KERNEL_OPT
to autoconf where necessary.


# 1.397 02-Sep-2009 pooka

Initialize devsw (lock) early so that subsystems may play with it.


Revision tags: yamt-nfs-mp-base7
# 1.396 27-Jul-2009 mbalmer

Do not attach gpiosim(4) at root, but make it a pseudo device.
With help from Matthias Drochner, thanks!


# 1.395 25-Jul-2009 mbalmer

Allow gpiosim(4) to attach if configured in the kernel configuration.


Revision tags: jymxensuspend-base
# 1.394 19-Jul-2009 yamt

set LP_RUNNING when starting lwp0 and idle lwps.
add assertions.


# 1.393 19-Jul-2009 rmind

Make POSIX message queues a kernel module.


# 1.392 17-Jul-2009 ad

Don't send the quiet banner to the log, since the usual noise gets dumped
there anyway.


Revision tags: yamt-nfs-mp-base6
# 1.391 29-Jun-2009 dholland

Convert 67 namei call sites to use namei_simple, in these functions:

check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr

All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.

XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.


Revision tags: yamt-nfs-mp-base5
# 1.390 27-May-2009 pooka

Make domaininit() take an argument which determines if it should
add the special PF_ROUTE domain or not (if available).


Revision tags: yamt-nfs-mp-base4 yamt-nfs-mp-base3 nick-hppapmap-base4 nick-hppapmap-base3 jym-xensuspend-base nick-hppapmap-base
# 1.389 19-Apr-2009 ad

call rw_obj_init()


# 1.388 07-Apr-2009 tsutsui

Explicitly pass a specific buffer length to format_bytes() to
make it print memory sizes in humanized readable digits.


# 1.387 02-Apr-2009 ad

banner: fix a minor bug.


# 1.386 29-Mar-2009 ad

Remove debug code from previous.


# 1.385 29-Mar-2009 ad

Add a central banner() function to print the copyright and version info.


# 1.384 21-Mar-2009 ad

PR port-i386/40143 Viewing an mpeg transport stream with mplayer causes crash

Fix numerous problems:

1. LDT updates are not atomic.

2. Number of processes running with private LDTs and/or I/O bitmaps
is not capped. A system with a high maxprocs can be panicked.

3. LDTR can be leaked over context switch.

4. GDT slot allocations can race, giving the same LDT slot to two procs.

5. Incomplete interrupt/trap frames can be stacked.

6. In some rare cases segment faults are not handled correctly.


# 1.383 05-Mar-2009 yamt

main: disable kernel preemption early on during boot, namely between
configure() and configure2(). some kernel threads are not expected
to be run before "cold = 0". fixes cache_thread() busy-loop.


Revision tags: nick-hppapmap-base2
# 1.382 13-Feb-2009 apb

Use "defopt MODULAR" in sys/conf/files, and #include "opt_modular.h"
in all kernel sources that use the MODULAR option.
Proposed in tech-kern on 18 Jan 2009.


# 1.381 12-Feb-2009 christos

Unbreak ssp kernels. The issue here is that when the ssp_init() call was deferred,
it caused the return from the enclosing function to break, as well as the
ssp return on i386. To fix both issues, split configure in two pieces
the one before calling ssp_init and the one after, and move the ssp_init()
call back in main. Put ssp_init() in its own file, and compile this new file
with -fno-stack-protector. Tested on amd64.
XXX: If we want to have ssp kernels working on 5.0, this change needs to
be pulled up.


Revision tags: mjf-devfs2-base
# 1.380 11-Jan-2009 christos

branches: 1.380.2;
merge christos-time_t


Revision tags: christos-time_t-nbase christos-time_t-base
# 1.379 01-Jan-2009 pooka

* unexpose kprintf locking internals
* migrate from simplelock to kmutex

Don't bother to bump kernel version, since nothing outside of subr_prf
used KPRINTF_MUTEX_ENXIT()


Revision tags: haad-dm-base2 haad-nbase2 haad-dm-base
# 1.378 07-Dec-2008 pooka

Move some sysctl node creations away from linksets and into the
constructors for subsystems.

XXX: CTLFLAG_PERMANENT is non-sensible.


Revision tags: ad-audiomp2-base
# 1.377 04-Dec-2008 he

Ksyms are optional, so make the call to ksyms_init() dependent
on the same conditionals which are defined in sys/conf/files.


# 1.376 30-Nov-2008 martin

As discussed on tech-kern: mutex_init is too heavyweight for early bootstrap
phases, so move the initialization of the ksyms mutex back into main via
a function called ksyms_init. Rename the existing (but quite different)
ksyms_init* variations into ksyms_addsyms_elf() and ksyms_addsyms_explicit()
and adapt machdep code accordingly.


# 1.375 18-Nov-2008 pooka

cwd is logically a vfs concept, so take it out from the bosom of
kern_descrip and into vfs_cwd. No functional change.


# 1.374 14-Nov-2008 ad

Make POSIX AIO loadable as a module.


# 1.373 12-Nov-2008 ad

Allow the POSIX semaphore code to be loaded as a module.


# 1.372 12-Nov-2008 ad

Remove LKMs and switch to the module framework, pass 1.

Proposed on tech-kern@.


Revision tags: netbsd-5-0-RC2 netbsd-5-0-RC1 netbsd-5-base
# 1.371 28-Oct-2008 tsutsui

branches: 1.371.2;
On the prompt for init path, print a simple usage line
if input strings are not valid path or command.
Per comments from perry@ and pgoyette@.


# 1.370 25-Oct-2008 tsutsui

branches: 1.370.2;
- if no usable init(8) program (listed in *initpaths[]) can be found,
set the RB_ASKNAME flag and prompt users for the init path, rather than
panicking with "no init".
- when prompting for the init path, support the special strings
"halt", "reboot", and "ddb", as well as a prompt for the root device.

Discussed and no objection on tech-kern. Changes summary by apb@.
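
A rough sketch of how input at such a prompt might be classified is given below. The special strings "halt", "reboot" and "ddb" come from the commit message; the enum, the function name classify_init_input() and the rule that a usable path must start with '/' are assumptions for illustration, not the kernel's code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical outcome of reading one line at the init path prompt. */
enum init_action {
	INIT_ACT_HALT,
	INIT_ACT_REBOOT,
	INIT_ACT_DDB,
	INIT_ACT_PATH,		/* treat the input as a path to init */
	INIT_ACT_INVALID	/* neither a command nor a path */
};

static enum init_action
classify_init_input(const char *line)
{
	assert(line != NULL);

	/* Special strings named in the commit message. */
	if (strcmp(line, "halt") == 0)
		return INIT_ACT_HALT;
	if (strcmp(line, "reboot") == 0)
		return INIT_ACT_REBOOT;
	if (strcmp(line, "ddb") == 0)
		return INIT_ACT_DDB;

	/* Assumption of this sketch: a usable path is absolute. */
	if (line[0] == '/')
		return INIT_ACT_PATH;

	return INIT_ACT_INVALID;
}
```

Anything that falls through to INIT_ACT_INVALID is the case where a caller would print a short usage line and prompt again.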


Revision tags: matt-mips64-base2
# 1.369 23-Oct-2008 christos

don't expose ksyms_lock


# 1.368 21-Oct-2008 matt

Only init ksyms mutex if ksyms is present in the kernel


# 1.367 20-Oct-2008 ad

PR kern/38814 ksyms needs locking

- Make ksyms MT safe.
- Fix deadlock from an operation like "modload foo.lkm < /dev/ksyms".
- Fix uninitialized structure members.
- Reduce memory footprint for loaded modules.
- Export ksyms structures for kernel grovellers like savecore.
- Some KNF.


Revision tags: haad-dm-base1
# 1.366 18-Oct-2008 rmind

- Initialize pool subsystem and kmem(9) earlier, when UVM is up enough.
- Remove uao_hashinit() workaround used for anon-objects.
- Replace malloc with kmem.

OK by <yamt>.


# 1.365 11-Oct-2008 pooka

Move uidinfo to its own module in kern_uidinfo.c and include in rump.
No functional change to uidinfo.


Revision tags: wrstuden-revivesa-base-4
# 1.364 09-Oct-2008 pooka

Rewrite once to use global locks and atomic ops to get rid of the
static simplelock initializer (and simplelock too). The fastpath
is still lockless, so doesn't make a difference in terms of
performance.

Also fixes a hanging bug if the once routine returned an error.
It does not retry after an error occurs, as I can't really imagine
fruitful semantics for that.
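
The semantics described above (lockless fast path, one global lock, no retry after an error) can be sketched in portable C11 roughly as follows. This is not the kernel's once implementation: run_once(), struct once as laid out here, and the atomic_flag spinlock standing in for the global lock are assumptions, and the routine runs with the lock held purely to keep the sketch short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

enum { ONCE_VIRGIN = 0, ONCE_DONE = 1 };

struct once {
	atomic_int state;	/* ONCE_VIRGIN until the routine has run */
	int error;		/* result of the single run; valid once DONE */
};

/* One global lock shared by all once instances (spinlock stand-in). */
static atomic_flag once_lock = ATOMIC_FLAG_INIT;

static int
run_once(struct once *o, int (*fn)(void))
{
	assert(o != NULL && fn != NULL);

	/* Fast path: no lock is taken once the routine has completed. */
	if (atomic_load_explicit(&o->state, memory_order_acquire) == ONCE_DONE)
		return o->error;

	while (atomic_flag_test_and_set_explicit(&once_lock,
	    memory_order_acquire))
		;	/* spin until the global lock is free */

	/* Re-check under the lock; another caller may have finished. */
	if (atomic_load_explicit(&o->state, memory_order_relaxed) != ONCE_DONE) {
		o->error = fn();	/* an error is recorded, never retried */
		atomic_store_explicit(&o->state, ONCE_DONE,
		    memory_order_release);
	}
	atomic_flag_clear_explicit(&once_lock, memory_order_release);
	return o->error;
}

/* Example routine that always fails, to show the no-retry behaviour. */
static int init_calls;

static int
failing_init(void)
{
	init_calls++;
	return 5;
}

/* Zero-initialized static storage starts out as ONCE_VIRGIN. */
static struct once example_once;
```

Calling run_once(&example_once, failing_init) repeatedly returns the same stored error each time, while failing_init() itself runs only on the first call.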


Revision tags: wrstuden-revivesa-base-3 wrstuden-revivesa-base-2
# 1.363 30-Aug-2008 reinoud

Revert previous change and clarify meaning of RNG


# 1.362 30-Aug-2008 reinoud

Fix simple typo:
- rnd_init(); /* initialize RNG */
+ rnd_init(); /* initialize RND */


# 1.361 31-Jul-2008 simonb

Merge the simonb-wapbl branch. From the original branch commit:

Add Wasabi Systems' WAPBL (Write Ahead Physical Block Logging)
journaling code. Originally written by Darrin B. Jewell while
at Wasabi and updated to -current by Antti Kantee, Andy Doran,
Greg Oster and Simon Burge.

OK'd by core@, releng@.


Revision tags: wrstuden-revivesa-base-1 simonb-wapbl-nbase simonb-wapbl-base wrstuden-revivesa-base
# 1.360 18-Jun-2008 yamt

branches: 1.360.2;
merge yamt-pf42 branch.
(import newer pf from OpenBSD 4.2)

ok'ed by peter@. requested by core@


Revision tags: yamt-pf42-base4
# 1.359 16-Jun-2008 ad

PR kern/38927: processes getting stuck in uvm_map (cv_timedwait), hanging
machine

Assume that a vnode (and associated data structures) costs 2kB in the
worst imaginable case. Don't allow sysctl to set desiredvnodes to a
value that would use more than 75% of KVA or 75% of physical memory.
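
The arithmetic behind that limit can be sketched as below, assuming the cap is taken against whichever of KVA and physical memory is smaller. The function names and the standalone user-space form are hypothetical; only the 2kB per-vnode cost and the 75% figure come from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define VNODE_COST_BYTES 2048u	/* worst-case cost per vnode, per the commit */

/* Largest vnode count whose worst-case cost stays within 75% of both
 * the kernel virtual address space and physical memory. */
static uint64_t
max_desiredvnodes(uint64_t kva_bytes, uint64_t phys_bytes)
{
	uint64_t smaller = kva_bytes < phys_bytes ? kva_bytes : phys_bytes;

	/* 75% of the smaller resource; divide first to avoid overflow. */
	uint64_t budget = (smaller / 4) * 3;

	return budget / VNODE_COST_BYTES;
}

/* Nonzero if a requested value would be accepted under this rule. */
static int
desiredvnodes_ok(uint64_t requested, uint64_t kva_bytes, uint64_t phys_bytes)
{
	return requested <= max_desiredvnodes(kva_bytes, phys_bytes);
}
```

For example, with 1 GiB of KVA and 512 MiB of physical memory, the budget is 384 MiB, which at 2kB per vnode allows 196608 vnodes.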


Revision tags: yamt-pf42-base3
# 1.358 31-May-2008 ad

branches: 1.358.2;
- Put in place module compatibility check against __NetBSD_Version__,
as discussed on tech-kern.

- Remove unused module_jettison().


# 1.357 28-May-2008 ad

PR kern/38355 lockf deadlock detection is broken after vmlocking

- Fix it; tested with Sun's libMicro.
- Use pool_cache.
- Use a global lock, so the deadlock detection code is safer.


# 1.356 27-May-2008 ad

Replace a couple of tsleep calls with cv_wait.


Revision tags: hpcarm-cleanup-nbase yamt-pf42-base2 yamt-nfs-mp-base2
# 1.355 01-May-2008 ad

branches: 1.355.2;
Get the pre-loaded module code working.


# 1.354 28-Apr-2008 martin

Remove clause 3 and 4 from TNF licenses


Revision tags: yamt-nfs-mp-base
# 1.353 24-Apr-2008 ad

branches: 1.353.2;
Merge proc::p_mutex and proc::p_smutex into a single adaptive mutex, since
we no longer need to guard against access from hardware interrupt handlers.

Additionally, if cloning a process with CLONE_SIGHAND, arrange to have the
child process share the parent's lock so that signal state may be kept in
sync. Partially addresses PR kern/37437.


# 1.352 24-Apr-2008 ad

Network protocol interrupts can now block on locks, so merge the globals
proclist_mutex and proclist_lock into a single adaptive mutex (proc_lock).
Implications:

- Inspecting process state requires thread context, so signals can no longer
be sent from a hardware interrupt handler. Signal activity must be
deferred to a soft interrupt or kthread.

- As the proc state locking is simplified, it's now safe to take exit()
and wait() out from under kernel_lock.

- The system spends less time at IPL_SCHED, and there is less lock activity.


# 1.351 24-Apr-2008 sborrill

It's only a typo in a comment, but it reduces the number of diffs in my local
tree :-)


# 1.350 21-Apr-2008 ad

timer fixes for PR 37093:

- Fix serious concurrency problems, making the code MT and MP safe in
the process.
- Don't allocate memory or inspect process state from hardclock().


Revision tags: yamt-pf42-baseX yamt-pf42-base
# 1.349 14-Apr-2008 ad

branches: 1.349.2;
SSP: block interrupts when enabling, and move the init to just before
starting secondary processors.


# 1.348 01-Apr-2008 ad

Use multiple kthreads to process config_interrupts tasks. Proposed on
tech-kern.


# 1.347 27-Mar-2008 ad

branches: 1.347.2;
Add code for dynamically allocated mutexes, as posted on tech-kern.


Revision tags: ad-socklock-base1
# 1.346 25-Mar-2008 yamt

- for some ports, especially for ones without pmap_growkernel,
buf_memcalc is used by bootstrap as well. fix NULL dereference for them.
- limit kva usage for each cache to 20% of vm_map. XXX a bit arbitrary.
- add a comment.


Revision tags: yamt-lazymbuf-base15 yamt-lazymbuf-base14
# 1.345 23-Mar-2008 yamt

when calculating some cache sizes, consider the amount of available kva.
PR/33185.


# 1.344 22-Mar-2008 ad

Commit the "per-CPU" select patch. This is the result of much work and
testing by rmind@ and myself.

Which approach to use is still being discussed, but I would like to get
this out of my working tree. If we decide to use a different approach
there is no problem with revisiting this.


# 1.343 21-Mar-2008 ad

Catch up with descriptor handling changes. See kern_descrip.c revision
1.173 for details.


Revision tags: keiichi-mipv6-nbase keiichi-mipv6-base matt-armv6-nbase
# 1.342 09-Mar-2008 rmind

Remove include of sys/pset.h in sys/lwp.h header.
Include it in a few appropriate sources.


Revision tags: nick-net80211-sync-base bouyer-xeni386-nbase mjf-devfs-base hpcarm-cleanup-base
# 1.341 20-Jan-2008 joerg

branches: 1.341.2; 1.341.6;
Now that __HAVE_TIMECOUNTER and __HAVE_GENERIC_TODR are invariants,
remove the conditionals and the code associated with the undef case.


Revision tags: bouyer-xeni386-base
# 1.340 16-Jan-2008 ad

Pull in my modules code for review/test/hacking.


# 1.339 15-Jan-2008 rmind

Implementation of processor-sets, affinity and POSIX real-time extensions.
Add schedctl(8) - a program to control scheduling of processes and threads.

Notes:
- This is supported only by SCHED_M2;
- Migration of LWP mechanism will be revisited;

Proposed on: <tech-kern>. Reviewed by: <ad>.


# 1.338 14-Jan-2008 yamt

add a per-cpu storage allocator.


# 1.337 12-Jan-2008 ad

Remove curlwp check, all ports should hopefully be doing the right thing
now (NOTICE: curlwp should be set before main()).


Revision tags: matt-armv6-base
# 1.336 02-Jan-2008 ad

Merge vmlocking2 to head.


# 1.335 31-Dec-2007 ad

Remove systrace. Ok core@.


# 1.334 27-Dec-2007 elad

Call pax_init() for PAX_ASLR.


Revision tags: vmlocking2-base3
# 1.333 26-Dec-2007 ad

Merge more changes from vmlocking2, mainly:

- Locking improvements.
- Use pool_cache for more items.


# 1.332 22-Dec-2007 yamt

use binuptime for l_stime/l_rtime.


Revision tags: yamt-kmem-base3 cube-autoconf-base yamt-kmem-base2 yamt-kmem-base jmcneill-pm-base
# 1.331 08-Dec-2007 pooka

branches: 1.331.2; 1.331.4;
Remove cn_lwp from struct componentname. curlwp should be used
from now on. The NDINIT() macro no longer takes the lwp parameter and
associates the credentials of the calling thread with the namei
structure.


Revision tags: vmlocking2-base2 reinoud-bufcleanup-nbase vmlocking2-base1 bouyer-xenamd64-base2 vmlocking-nbase bouyer-xenamd64-base reinoud-bufcleanup-base
# 1.330 15-Nov-2007 ad

branches: 1.330.2;
Add a bit of locking around timecounter attachment / selection.


# 1.329 14-Nov-2007 ad

Boot the secondary processors just before the interrupt-enabled section
of autoconfig. This is needed if APs are able to take interrupts.


# 1.328 14-Nov-2007 yamt

call debug_init earlier. ie. before malloc is used.


# 1.327 07-Nov-2007 matt

Use C99 structures initializers when possible.
[from matt-armv6]


# 1.326 07-Nov-2007 ad

Merge tty changes from the vmlocking branch.


# 1.325 07-Nov-2007 ad

Merge from vmlocking.


Revision tags: jmcneill-base
# 1.324 06-Nov-2007 ad

Merge scheduler changes from the vmlocking branch. All discussed on
tech-kern:

- Invert priority space so that zero is the lowest priority. Rearrange
number and type of priority levels into bands. Add new bands like
'kernel real time'.
- Ignore the priority level passed to tsleep. Compute priority for
sleep dynamically.
- For SCHED_4BSD, make priority adjustment per-LWP, not per-process.


# 1.323 19-Oct-2007 ad

branches: 1.323.2;
machine/{bus,cpu,intr}.h -> sys/{bus,cpu,intr}.h


Revision tags: yamt-x86pmap-base4
# 1.322 15-Oct-2007 ad

branches: 1.322.2;
Add _SC_NPROCESSORS_ONLN and _SC_NPROCESSORS_CONF for sysconf(). These
are extensions but are provided by many Unix systems.


Revision tags: yamt-x86pmap-base3 vmlocking-base
# 1.321 11-Oct-2007 ad

Merge from vmlocking:

- G/C spinlockmgr() and simple_lock debugging.
- Always include the kernel_lock functions, for LKMs.
- Slightly improved subr_lockdebug code.
- Keep sizeof(struct lock) the same if LOCKDEBUG.


# 1.320 08-Oct-2007 ad

Merge run time accounting changes from the vmlocking branch. These make
the LWP "start time" per-thread instead of per-CPU.


# 1.319 08-Oct-2007 ad

Merge file descriptor locking, cwdi locking and cross-call changes
from the vmlocking branch.


Revision tags: yamt-x86pmap-base2
# 1.318 01-Oct-2007 martin

No need to db_init_commands() early any more - it will happen on first
entry to ddb.


# 1.317 25-Sep-2007 ad

Make previous conditional upon !__i386__ && !__x86_64__. I know this is
gross but it's a debug check that's not intended to live very long.
curlwp is about to become a function on x86 (and so can't be assigned to).


# 1.316 25-Sep-2007 ad

If curlwp is not set before main(), moan about it but continue to set it.
curlwp will need to be available earlier for UVM/pmap bootstrap.


# 1.315 24-Sep-2007 martin

db_init_commands() early


Revision tags: nick-csl-alignment-base5 yamt-x86pmap-base
# 1.314 07-Sep-2007 rmind

branches: 1.314.2;
Implementation of POSIX message queues.

Reviewed by: <ad>, <tech-kern>


# 1.313 02-Sep-2007 xtraeme

Convert the sysmon watchdog framework to use mutex(9) rather than
simple_locks and initialize them on init_main via sysmon_wdog_init().

All the sysmon code now is cleaned up and doesn't use old style locking.


Revision tags: matt-mips64-base
# 1.312 04-Aug-2007 ad

branches: 1.312.2; 1.312.4;
Add cpuctl(8). For now this is not much more than a toy for debugging and
benchmarking that allows taking CPUs online/offline.


# 1.311 30-Jul-2007 pooka

branches: 1.311.4;
move setrootfstime() from init_main.c to vfs_subr2.c


# 1.310 21-Jul-2007 xtraeme

Convert sysmon_taskqueue to use mutex(9) and condvar(9) and initialize
them in init_main.c via sysmon_task_queue_preinit().

Reviewed and ok by ad@.


# 1.309 21-Jul-2007 ad

Replace some uses of lockmgr().


# 1.308 20-Jul-2007 tsutsui

Defer callout_startup2() (which calls softintr_establish(9)) call
after cpu_configure(9) for now because softintr(9) is initialized
in cpu_configure(9) on some ports.

Ok'ed by ad@ on current-users, and fixes hangs on m68k ports
during scsi probe.


Revision tags: nick-csl-alignment-base mjf-ufs-trans-base
# 1.307 09-Jul-2007 ad

branches: 1.307.2;
Merge some of the less invasive changes from the vmlocking branch:

- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements


# 1.306 09-Jul-2007 ad

Remove kcont:

- There are no users in tree.
- Its functionality has largely been replaced by workqueues and generic
soft interrupts.
- It's not MP friendly.


# 1.305 01-Jul-2007 xtraeme

Imported envsys 2, a brief description of the new features:
(Part 1: API)

* Support for detachable sensors.
* Cleaned up the API for simplicity and efficiency.
* Ability to send capacity/critical/warning events to powerd(8).
* Adapted all the code to the new locking order.
* Compatibility with the old envsys API: the ENVSYS_GTREINFO
and ENVSYS_GTREDATA ioctl(2)s are supported.
* Added support for a 'dictionary based communication channel' between
sysmon_power(9) and powerd(8), that means there is no 32 bytes event
size restriction anymore.
* Binary compatibility with old envstat(8) and powerd(8) via COMPAT_40.
* All drivers with the n^2 gtredata bug were fixed, PR kern/36226.

Tested by:

blymn: smsc(4).
bouyer: ipmi(4), mfi(4).
kefren: ug(4).
njoly: viaenv(4), adt7463.c.
riz: owtemp(4).
xtraeme: acpiacad(4), acpibat(4), acpitz(4), aiboost(4), it(4), lm(4).


# 1.304 17-Jun-2007 yamt

periodically resize vmem hash table.


# 1.303 31-May-2007 rmind

- Make aio_worker handle pending exits and coredumps
- Allow aio_suspend() to be ended early by a signal
- Fix reference counting on LIO structures (remove hack)
- Use two global pools for AIO structures
- Minor cleanups

Patch provided by <ad>. Some additional modifications by me.
Reviewed by <yamt>.


# 1.302 17-May-2007 yamt

merge yamt-idlelwp branch. asked by core@. some ports still need work.

from doc/BRANCHES:

idle lwp, and some changes depending on it.

1. separate context switching and thread scheduling.
(cf. gmcgarry_ctxsw)
2. implement idle lwp.
3. clean up related MD/MI interfaces.
4. make scheduler(s) modular.


Revision tags: yamt-idlelwp-base8 thorpej-atomic-base
# 1.301 13-Mar-2007 ad

Don't call pipe_init if PIPE_SOCKETPAIR is defined.


# 1.300 12-Mar-2007 ad

Put a lock around pipe->pipe_peer.


# 1.299 10-Mar-2007 ad

branches: 1.299.2; 1.299.4;
lockdebug:

- Initialize on the first allocation.
- Handle overflow better. PR kern/35723.


# 1.298 09-Mar-2007 ad

- Make the proclist_lock a mutex. The write:read ratio is unfavourable,
and mutexes are cheaper to use than RW locks.
- LOCK_ASSERT -> KASSERT in some places.
- Hold proclist_lock/kernel_lock longer in a couple of places.


# 1.297 04-Mar-2007 christos

Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.


# 1.296 03-Mar-2007 itohy

Remove extra space so that symbol renaming works properly.


Revision tags: ad-audiomp-base
# 1.295 17-Feb-2007 pavel

Change the process/lwp flags seen by userland via sysctl back to the
P_*/L_* naming convention, and rename the in-kernel flags to avoid
conflict. (P_ -> PK_, L_ -> LW_ ). Add back the (now unused) LSDEAD
constant.

Restores source compatibility with pre-newlock2 tools like ps or top.

Reviewed by Andrew Doran.


# 1.294 15-Feb-2007 ad

branches: 1.294.2;
Count the number of CPUs at boot and stash in 'ncpu'. Eventually should
have each CPU register at attach, so we can figure out the topology for
the scheduler.


# 1.293 11-Feb-2007 yamt

remove a duplicated inclusion of sleepq.h.


Revision tags: post-newlock2-merge
# 1.292 09-Feb-2007 ad

Merge newlock2 to head.


Revision tags: newlock2-nbase newlock2-base
# 1.291 27-Jan-2007 elad

Add a comment to indicate the reason for kauth_init() and secmodel_start()
being where they are. Suggested by and okay christos@.


# 1.290 27-Jan-2007 elad

Start the security model sooner.

As with previous commit, we want to allow the secmodel code to control
the credential inheritance, etc., so we need it started earlier (also
before proc0_init()).


# 1.289 26-Jan-2007 elad

Initialize kauth(9) sooner.

Since we'll soon want to be able to control the inheritance of credentials,
kauth(9) needs to be ready for use much sooner -- at least before the call
to proc0_init().


# 1.288 19-Jan-2007 hannken

New file system suspension API to replace vn_start_write and vn_finished_write.
The suspension helpers are now put into file system specific operations.
This means every file system not supporting these helpers cannot be suspended
and therefore snapshots are no longer possible.

Implemented for file systems of type ffs.

The new API is enabled on a kernel option NEWVNGATE. This option is
not enabled by default in any kernel config.

Presented and discussed on tech-kern with much input from
Bill Studenmund <wrstuden@netbsd.org> and YAMAMOTO Takashi <yamt@netbsd.org>.

Welcome to 4.99.9 (new vfs op vfs_suspendctl).


# 1.287 17-Jan-2007 elad

Oops - this should have gone in a long time ago.

Weak alias secmodel_start to a nop routine, for building without a secmodel
in the kernel.


# 1.286 21-Dec-2006 yamt

merge yamt-splraiseipl branch.

- finish implementing splraiseipl (and makeiplcookie).
http://mail-index.NetBSD.org/tech-kern/2006/07/01/0000.html
- complete workqueue(9) and fix its ipl problem, which is reported
to cause audio skipping.
- fix netbt (at least compilation problems) for some ports.
- fix PR/33218.


Revision tags: yamt-splraiseipl-base5 yamt-splraiseipl-base4
# 1.285 11-Dec-2006 yamt

- remove a static configuration, FILEASSOC_NHOOKS. do it dynamically instead.
- make fileassoc_t a pointer and remove FILEASSOC_INVAL.
- clean up kern_fileassoc.c. unify duplicated code.
- unexport fileassoc_init using RUN_ONCE(9).
- plug memory leaks in fileassoc_file_delete and fileassoc_table_delete.
- always call callbacks, regardless of the value of the associated data.

ok'ed by elad.


Revision tags: yamt-splraiseipl-base3
# 1.284 07-Dec-2006 ad

iostat: avoid sleeping with a held simple_lock.


Revision tags: netbsd-4-0-1-RELEASE wrstuden-fixsa-newbase wrstuden-fixsa-base-1 netbsd-4-0-RELEASE netbsd-4-0-RC5 matt-nb4-arm-base netbsd-4-0-RC4 netbsd-4-0-RC3 netbsd-4-0-RC2 netbsd-4-0-RC1 wrstuden-fixsa-base netbsd-4-base
# 1.283 26-Nov-2006 elad

I wanted to do this for so long: veriexec_init_fp_ops() -> veriexec_init().


# 1.282 22-Nov-2006 elad

Initial implementation of PaX Segvguard (this is still work-in-progress,
it's just to get it out of my local tree).


# 1.281 22-Nov-2006 elad

Make PaX MPROTECT use specificdata(9), freeing up two P_* flags.
While here, make more generic for upcoming PaX features.


# 1.280 11-Nov-2006 christos

Add SSP support.
XXX: This is broken for me right now, because my kernel resets after fxp0
is probed, but it could be some bug in the driver/compiler.


Revision tags: yamt-splraiseipl-base2
# 1.279 08-Oct-2006 thorpej

Add specificdata support to procs and lwps, each providing their own
wrappers around the specificdata subroutines. Also:
- Call the new lwpinit() function from main() after calling procinit().
- Move some pool initialization out of kern_proc.c and into files that
are directly related to the pools in question (kern_lwp.c and kern_ras.c).
- Convert uipc_sem.c to proc_{get,set}specific(), and eliminate the p_ksems
member from struct proc.


# 1.278 02-Oct-2006 elad

Move the kauth_init() call above auto-configuration; this will fix some
recent bugs introduced with the usage of kauth(9) in MD/device code.

While here, change the sanity checks to KASSERT(), because they're really
bugs we should fix if triggered.


Revision tags: yamt-splraiseipl-base yamt-pdpolicy-base9
# 1.277 08-Sep-2006 elad

branches: 1.277.2;
First take at security model abstraction.

- Add a few scopes to the kernel: system, network, and machdep.

- Add a few more actions/sub-actions (requests), and start using them as
opposed to the KAUTH_GENERIC_ISSUSER place-holders.

- Introduce a basic set of listeners that implement our "traditional"
security model, called "bsd44". This is the default (and only) model we
have at the moment.

- Update all relevant documentation.

- Add some code and docs to help folks who want to actually use this stuff:

* There's a sample overlay model, sitting on-top of "bsd44", for
fast experimenting with tweaking just a subset of an existing model.

This is pretty cool because it's *really* straightforward to do stuff
you had to use ugly hacks for until now...

* And of course, documentation describing how to do the above for quick
reference, including code samples.

All of these changes were tested for regressions using a Python-based
testsuite that will be (I hope) available soon via pkgsrc. Information
about the tests, and how to write new ones, can be found on:

http://kauth.linbsd.org/kauthwiki

NOTE FOR DEVELOPERS: *PLEASE* don't add any code that does any of the
following:

- Uses a KAUTH_GENERIC_ISSUSER kauth(9) request,
- Checks 'securelevel' directly,
- Checks a uid/gid directly.

(or if you feel you have to, contact me first)

This is still work in progress; It's far from being done, but now it'll
be a lot easier.

Relevant mailing list threads:

http://mail-index.netbsd.org/tech-security/2006/01/25/0011.html
http://mail-index.netbsd.org/tech-security/2006/03/24/0001.html
http://mail-index.netbsd.org/tech-security/2006/04/18/0000.html
http://mail-index.netbsd.org/tech-security/2006/05/15/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/01/0000.html
http://mail-index.netbsd.org/tech-security/2006/08/25/0000.html

Many thanks to YAMAMOTO Takashi, Matt Thomas, and Christos Zoulas for help
stabilizing kauth(9).

Full credit for the regression tests, making sure these changes didn't break
anything, goes to Matt Fleming and Jaime Fournier.

Happy birthday Randi! :)


Revision tags: abandoned-netbsd-4-base yamt-pdpolicy-base8 yamt-pdpolicy-base7 rpaulo-netinet-merge-pcb-base
# 1.276 26-Jul-2006 dogcow

branches: 1.276.4;
at the request of elad, as veriexec.h has returned, revert the changes
from 2006-07-25.


# 1.275 25-Jul-2006 dogcow

mechanically go through and
s,include "veriexec.h",include <sys/verified_exec.h>,
as the former has apparently gone away.


# 1.274 24-Jul-2006 elad

some fixes:
- adapt to NVERIEXEC in init_sysctl.c.
- we now need "veriexec.h" for NVERIEXEC.
- "opt_verified_exec.h" -> "opt_veriexec.h", and include it only where
it is needed.


# 1.273 22-Jul-2006 elad

deprecate the VERIFIED_EXEC option; now we only need the pseudo-device to
enable it. while here, some config file tweaks.

tons of input from cube@ (thanks!) and okay blymn@.


# 1.272 14-Jul-2006 kardel

keep NetBSD boottime semantics:
- only set at boot
- only tracking delta of set-time operations
-> will keep boottime stable across ACPI sleeps
uptime(1) will report the time since last boot


# 1.271 14-Jul-2006 elad

okay, since there was no way to divide this into two commits, here it goes..

introduce fileassoc(9), a kernel interface for associating meta-data with
files using in-kernel memory. this is very similar to what we had in
veriexec till now, only abstracted so it can be used more easily by more
consumers.

this also prompted the redesign of the interface, making it work on vnodes
and mounts and not directly on devices and inodes. internally, we still
use file-id but that's gonna change soon... the interface will remain
consistent.

as a result, veriexec went under some heavy changes to conform to the new
interface. since we no longer use device numbers to identify file-systems,
the veriexec sysctl stuff changed too: kern.veriexec.count.dev_N is now
kern.veriexec.tableN.* where 'N' is NOT the device number but rather a
way to distinguish several mounts.

also worth noting is the plugging of unmount/delete operations
wrt/fileassoc and veriexec.

tons of input from yamt@, wrstuden@, martin@, and christos@.


# 1.270 01-Jul-2006 kardel

always call ntp initialisation for timecounter systems as
the ntp code degenerates to the adjtime implementation in the
non NTP case


Revision tags: yamt-pdpolicy-base6
# 1.269 25-Jun-2006 yamt

1. implement solaris-like vmem. (still primitive, though)
2. implement solaris-like kmem_alloc/free api, using #1.
(note: this implementation is backed by kernel_map, thus can't be
used from interrupt context.)


Revision tags: chap-midi-nbase gdamore-uart-base chap-midi-base
# 1.268 09-Jun-2006 kardel

branches: 1.268.2;
re-order initialization sequence to have real counters available during autoconfig


# 1.267 07-Jun-2006 kardel

merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
time.tv_sec -> time_second
- struct timeval mono_time is gone
mono_time.tv_sec -> time_uptime
- access to time via
{get,}{micro,nano,bin}time()
get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html


Revision tags: yamt-pdpolicy-base5 simonb-timecounters-base
# 1.266 14-May-2006 elad

branches: 1.266.2;
integrate kauth.


Revision tags: yamt-pdpolicy-base4 elad-kernelauth-base
# 1.265 10-Apr-2006 onoe

Move "opt_maxuprc.h" from init_main.c to kern_proc.c, as the definition
of maxuprc has been moved to kern_proc.c (rev. 1.80).


Revision tags: yamt-pdpolicy-base3
# 1.264 30-Mar-2006 elad

Remove useless whitespace.

This commit is dedicated to Dan Langille.


Revision tags: peter-altq-base yamt-pdpolicy-base2
# 1.263 07-Mar-2006 thorpej

branches: 1.263.2; 1.263.4;
Clean up fallout from the proc_is_traced_p() change:
- proc_is_traced_p() -> trace_is_enabled(), to match trace_enter() and
trace_exit().
- trace_is_enabled() becomes a real function.
- Remove unnecessary include files from various files that used to care
about KTRACE and SYSTRACE, but do no more.


Revision tags: yamt-pdpolicy-base yamt-uio_vmspace-base5
# 1.262 12-Feb-2006 chs

branches: 1.262.2;
convert "magiclinks" from a per-fs mount option to a system-wide sysctl.
as discussed on tech-kern quite some time ago.


# 1.261 24-Dec-2005 perry

branches: 1.261.2; 1.261.4; 1.261.6;
Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.


# 1.260 11-Dec-2005 christos

merge ktrace-lwp.


Revision tags: yamt-readahead-base3 ktrace-lwp-base
# 1.259 27-Nov-2005 thorpej

Overhaul how TTY line disciplines are handled:
- Replace references to linesw[0] with a ttyldisc_default() function
that returns the default ("termios") line discipline.
- The linesw[] array is gone, replaced by a linked list.
- ttyldisc_add() and ttyldisc_remove() have been replaced by
ttyldisc_attach() and ttyldisc_detach().
- Things that provide line disciplines are now responsible for
registering those disciplines with the system. The linesw
structures are no longer declared in tty_conf.c
- Line disciplines are now refcounted; a lookup causes a reference to
be held. ttyldisc_release() releases the reference. Attempts to
detach an in-use line discipline result in EBUSY.
- Fix function signature lossage in if_sl.c, if_strip.c, and tty_tb.c
that was masked by the old tty_conf.c
- tty_init() is no longer necessary; delete it and its call from main().
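
The reference-counting rule in the list above (a lookup holds a reference, and
detaching an in-use discipline fails with EBUSY) can be sketched like this.
The struct and the *_sketch functions are hypothetical stand-ins, not the real
ttyldisc_*() signatures, and the locking a kernel would need is omitted.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a line discipline entry on a linked list. */
struct ldisc_sketch {
	const char *name;
	unsigned refcnt;		/* references held by lookups */
	struct ldisc_sketch *next;
};

static struct ldisc_sketch *ldisc_list;

/* Register a discipline; the provider calls this itself. */
static int
ldisc_attach_sketch(struct ldisc_sketch *d)
{
	for (struct ldisc_sketch *p = ldisc_list; p != NULL; p = p->next)
		if (strcmp(p->name, d->name) == 0)
			return EEXIST;
	d->refcnt = 0;
	d->next = ldisc_list;
	ldisc_list = d;
	return 0;
}

/* Look up by name; a successful lookup takes a reference. */
static struct ldisc_sketch *
ldisc_lookup_sketch(const char *name)
{
	for (struct ldisc_sketch *p = ldisc_list; p != NULL; p = p->next) {
		if (strcmp(p->name, name) == 0) {
			p->refcnt++;
			return p;
		}
	}
	return NULL;
}

/* Drop a reference obtained from a lookup. */
static void
ldisc_release_sketch(struct ldisc_sketch *d)
{
	d->refcnt--;
}

/* Unregister; refuses while any reference is outstanding. */
static int
ldisc_detach_sketch(struct ldisc_sketch *d)
{
	if (d->refcnt != 0)
		return EBUSY;
	for (struct ldisc_sketch **pp = &ldisc_list; *pp != NULL;
	    pp = &(*pp)->next) {
		if (*pp == d) {
			*pp = d->next;
			d->next = NULL;
			return 0;
		}
	}
	return ENOENT;
}
```

A detach attempted between a lookup and the matching release returns EBUSY;
once the reference is released the same detach succeeds.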


# 1.258 25-Nov-2005 thorpej

Statically initialize the systrace lock. systrace_init() is now not
needed on NetBSD. Remove the call from main().


# 1.257 25-Nov-2005 thorpej

Use a once control to initialize the LKM subsystem on first open. Remove
the lkm_init() call from main().


# 1.256 25-Nov-2005 thorpej

Use a once control to initialize the NFS server / client shared data
from nfs_vfs_init() or sys_nfssvc(). Remove the nfs_init() call from
main().


# 1.255 25-Nov-2005 thorpej

Use a once control to initialize the 802.11 layer when
ieee80211_ifattach() is called. "wlan" no longer needs-flag,
and remove the ieee80211_init() call from main().


# 1.254 25-Nov-2005 thorpej

- De-couple the software crypto implementation from the rest of the
framework. There is no need to waste the space if you are only using
algorithms provided by hardware accelerators. To get the software
implementations, add "pseudo-device swcr" to your kernel config.
- Lazily initialize the opencrypto framework when crypto drivers
(either hardware or swcr) register themselves with the framework.


Revision tags: yamt-readahead-base2
# 1.253 18-Nov-2005 martin

Only call ieee80211_init() in kernels that include some wlan stuff.


# 1.252 18-Nov-2005 skrll

Resolve conflicts and adapt to NetBSD.

Thanks to dyoung@, scw@, and perry@ for help testing.

2005-08-30 15:27 avatar

Properly set ic_curchan before calling back to device driver to do channel
switching (ifconfig devX channel Y). This fix should make channel changing
work again in monitor mode.

Submitted by: sam
X-MFC-With: other ic_curchan changes

2005-08-13 18:50 sam

revert 1.64: we cannot use the channel characteristics to decide when to
do 11g erp sta accounting because b/g channels show up as false positives
when operating in 11b.

Noticed by: Michal Mertl

2005-08-13 18:31 sam

Extend acl support to pass ioctl requests through and use this to
add support for getting the current policy setting and collecting
the list of mac addresses in the acl table.

Submitted by: Michal Mertl (original version)
MFC after: 2 weeks

2005-08-10 18:42 sam

Don't use ic_curmode to decide when to do 11g station accounting,
use the station channel properties. Fixes assert failure/bogus
operation when an ap is operating in 11a and has associated stations
then switches to 11g.

Noticed by: Michal Mertl
Reviewed by: avatar
MFC after: 2 weeks

2005-08-10 17:22 sam

Clarify/fix handling of the current channel:
o add ic_curchan and use it uniformly for specifying the current
channel instead of overloading ic->ic_bss->ni_chan (or in some
drivers ic_ibss_chan)
o add ieee80211_scanparams structure to encapsulate scanning-related
state captured for rx frames
o move rx beacon+probe response frame handling into separate routines
o change beacon+probe response handling to treat the scan table
more like a scan cache--look for an existing entry before adding
a new one; this combined with ic_curchan use corrects handling of
stations that were previously found at a different channel
o move adhoc neighbor discovery by beacon+probe response frames to
a new ieee80211_add_neighbor routine

Reviewed by: avatar
Tested by: avatar, Michal Mertl
MFC after: 2 weeks

2005-08-09 11:19 rwatson

Propagate rename of IFF_OACTIVE and IFF_RUNNING to IFF_DRV_OACTIVE and
IFF_DRV_RUNNING, as well as the move from ifnet.if_flags to
ifnet.if_drv_flags. Device drivers are now responsible for
synchronizing access to these flags, as they are in if_drv_flags. This
helps prevent races between the network stack and device driver in
maintaining the interface flags field.

Many __FreeBSD__ and __FreeBSD_version checks maintained and continued;
some less so.

Reviewed by: pjd, bz
MFC after: 7 days

2005-08-08 19:46 sam

Split crypto tx+rx key indices and add a key index -> node mapping table:

Crypto changes:
o change driver/net80211 key_alloc api to return tx+rx key indices; a
driver can leave the rx key index set to IEEE80211_KEYIX_NONE or set
it to be the same as the tx key index (the former disables use of
the key index in building the keyix->node mapping table and is the
default setup for naive drivers by null_key_alloc)
o add cs_max_keyid to crypto state to specify the max h/w key index a
driver will return; this is used to allocate the key index mapping
table and to bounds check table lookups
o while here introduce ieee80211_keyix (finally) for the type of a h/w
key index
o change crypto notifiers for rx failures to pass the rx key index up
as appropriate (michael failure, replay, etc.)

Node table changes:
o optionally allocate a h/w key index to node mapping table for the
station table using the max key index setting supplied by drivers
(note the scan table does not get a map)
o defer node table allocation to lateattach so the driver has a chance
to set the max key id to size the key index map
o while here also defer the aid bitmap allocation
o add new ieee80211_find_rxnode_withkey api to find a sta/node entry
on frame receive with an optional h/w key index to use in checking
mapping table; also updates the map if it does a hash lookup and the
found node has a rx key index set in the unicast key; note this work
is separated from the old ieee80211_find_rxnode call so drivers do
not need to be aware of the new mechanism
o move some node table manipulation under the node table lock to close
a race on node delete
o add ieee80211_node_delucastkey to do the dirty work of deleting
unicast key state for a node (deletes any key and handles key map
references)

Ath driver:
o nuke private sc_keyixmap mechanism in favor of net80211 support
o update key alloc api

These changes close several race conditions for the ath driver operating
in ap mode. Other drivers should see no change. Station mode operation
for ath no longer uses the key index map but performance tests show no
noticeable change and this will be fixed when the scan table is eliminated
with the new scanning support.

Tested by: Michal Mertl, avatar, others
Reviewed by: avatar, others
MFC after: 2 weeks

2005-08-08 06:49 sam

use ieee80211_iterate_nodes to retrieve station data; the previous
code walked the list w/o locking

MFC after: 1 week

2005-08-08 04:30 sam

Cleanup beacon/listen interval handling:
o separate configured beacon interval from listen interval; this
avoids potential use of one value for the other (e.g. setting
powersavesleep to 0 clobbers the beacon interval used in hostap
or ibss mode)
o bounds check the beacon interval received in probe response and
beacon frames and drop frames with bogus settings; not clear
if we should instead clamp the value as any alteration would
result in mismatched sta+ap configuration and probably be more
confusing (don't want to log to the console but perhaps ok with
rate limiting)
o while here up max beacon interval to reflect WiFi standard

Noticed by: Martin <nakal@nurfuerspam.de>
MFC after: 1 week
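
The drop-rather-than-clamp policy for received beacon intervals can be
sketched as below. The bounds and all names here are hypothetical
placeholders chosen for illustration; the commit message does not state the
actual limits used by net80211.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bounds, for illustration only. */
#define BINTVAL_MIN_SKETCH 25u
#define BINTVAL_MAX_SKETCH 1000u

/* True if a beacon interval taken from a received beacon or probe
 * response frame is within the accepted range. */
static bool
beacon_interval_ok_sketch(uint16_t bintval)
{
	return bintval >= BINTVAL_MIN_SKETCH && bintval <= BINTVAL_MAX_SKETCH;
}

/* Returns true if the frame should be processed, false if it should be
 * dropped. The value is never clamped, so station and AP settings cannot
 * silently diverge. */
static bool
accept_beacon_sketch(uint16_t rx_bintval, uint16_t *node_bintval)
{
	if (!beacon_interval_ok_sketch(rx_bintval))
		return false;		/* bogus setting: drop the frame */
	*node_bintval = rx_bintval;
	return true;
}
```

Keeping the accepted value in its own field, separate from any listen
interval setting, mirrors the first item in the list above.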

2005-08-06 05:57 sam

fix debug msg typo

MFC after: 3 days

2005-08-06 05:56 sam

Fix handling of frames sent prior to a station being authorized
when operating in ap mode. Previously we allocated a node from the
station table, sent the frame (using the node), then released the
reference that "held the frame in the table". But while the frame
was in flight the node might be reclaimed which could lead to
problems. The solution is to add an ieee80211_tmp_node routine
that crafts a node that does not exist in a table and so isn't ever
reclaimed; it exists only so long as the associated frame is in flight.

MFC after: 5 days

2005-07-31 07:12 sam

close a race between reclaiming a node when a station is inactive
and sending the null data frame used to probe inactive stations

MFC after: 5 days

2005-07-27 05:41 sam

when bridging internally bypass the bss node as traffic to it
must follow the normal input path

Submitted by: Michal Mertl
MFC after: 5 days

2005-07-27 03:53 sam

bandaid ni_fails handling so ap's with association failures are
reconsidered after a bit; a proper fix involves more changes to
the scanning infrastructure

Reviewed by: avatar, David Young
MFC after: 5 days

2005-07-23 01:16 sam

the AREF flag is only meaningful in ap mode; adhoc neighbors now
are timed out of the sta/neighbor table

2005-07-23 00:25 sam

o move inactivity-related debug msgs under IEEE80211_MSG_INACT
o probe inactive neighbors in adhoc mode (they don't have an
association id so previously were being timed out)

MFC after: 3 days

2005-07-22 22:11 sam

split xmit of probe request frame out into a separate routine that
takes explicit parameters; this will be needed when scanning is
decoupled from the state machine to do bg scanning

MFC after: 3 days

2005-07-22 21:48 sam

split 802.11 frame xmit setup code into ieee80211_send_setup

MFC after: 3 days

2005-07-22 18:57 sam

simplify ic_newassoc callback

MFC after: 3 days

2005-07-22 18:54 sam

simplify ieee80211_ibss_merge api

MFC after: 3 days

2005-07-22 18:50 sam

add stats we know we'll need soon and some spare fields for future expansion

MFC after: 3 days

2005-07-22 18:45 sam

simplify tim callback api

MFC after: 3 days

2005-07-22 18:42 sam

don't include 802.3 header in min frame length calculation as it may
not be present for a frag; fixes problem with small (fragmented) frames
being dropped

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:36 sam

simplify ieee80211_node_authorize and ieee80211_node_unauthorize api's

MFC after: 3 days

2005-07-22 18:31 sam

simplify ieee80211_send_nulldata api

MFC after: 3 days

2005-07-22 18:29 sam

simplify rate set api's by removing ic parameter (implicit in node reference)

MFC after: 3 days

2005-07-22 18:21 sam

reject association requests with a wpa/rsn ie when wpa/rsn is not
configured on the ap; previously we either ignored the ie or (possibly)
failed an assertion

Obtained from: Atheros
MFC after: 3 days

2005-07-22 18:16 sam

missed one in last commit; add device name to discard msgs

2005-07-22 18:13 sam

include device name in discard msgs

2005-07-22 18:12 sam

add diag msgs for frames discarded because the direction field is wrong

2005-07-22 18:08 sam

split data frame delivery out to a new function ieee80211_deliver_data

2005-07-22 18:00 sam

o add IEEE80211_IOC_FRAGTHRESHOLD for getting+setting the
tx fragmentation threshold
o fix bounds checking on IEEE80211_IOC_RTSTHRESHOLD

MFC after: 3 days

2005-07-22 17:55 sam

o add IEEE80211_FRAG_DEFAULT
o move default settings for RTS and frag thresholds to ieee80211_var.h

2005-07-22 17:50 sam

diff reduction against p4: define IEEE80211_FIXED_RATE_NONE and use
it instead of -1

2005-07-22 17:37 sam

add flags missed in last merge

2005-07-22 17:36 sam

Diff reduction against p4:
o add ic_flags_ext for eventual extension of ic_flags
o define/reserve flag+capabilities bits for superg,
bg scan, and roaming support
o refactor debug msg macros

MFC after: 3 days

2005-07-22 06:17 sam

send a response when an auth request is denied due to an acl;
might be better to silently ignore the frame but this way we
give stations a chance of figuring out what's wrong

2005-07-22 06:15 sam

remove excess whitespace

2005-07-22 05:55 sam

use IF_HANDOFF when bridging frames internally so if_start gets
called; fixes communication between associated sta's

MFC after: 3 days

2005-07-11 04:06 sam

Handle encrypt of arbitrarily fragmented mbuf chains: previously
we bailed if we couldn't collect the 16-bytes of data required
for an aes block cipher in 2 mbufs; now we deal with it. While
here make space accounting signed so a sanity check does the
right thing for malformed mbuf chains.

Approved by: re (scottl)

2005-07-11 04:00 sam

nuke assert that duplicates real check

Reviewed by: avatar
Approved by: re (scottl)


Revision tags: yamt-readahead-pervnode yamt-readahead-perfile yamt-readahead-base yamt-vop-base3 yamt-vop-base2 thorpej-vnode-attr-base yamt-vop-base
# 1.251 05-Aug-2005 junyoung

branches: 1.251.6;
Move proc0 initialization from main() in init_main.c and proc0_insert() in
kern_proc.c into a new function proc0_init() in kern_proc.c, as suggested
on tech-kern@ days ago.


# 1.250 16-Jul-2005 christos

defopt verified_exec.


# 1.249 15-Jul-2005 simonb

White space KNF nit.


# 1.248 23-Jun-2005 thorpej

branches: 1.248.2;
Implement expansion of special "magic" strings in symlinks into
system-specific values. Submitted by Chris Demetriou in Nov 1995 (!)
in PR kern/1781, modified only slightly by me.

This is enabled on a per-mount basis with the MNT_MAGICLINKS mount
flag. It can be enabled at mountroot() time by building the kernel
with the ROOTFS_MAGICLINKS option.

The following magic strings are supported by the implementation:

@machine        value of MACHINE for the system
@machine_arch   value of MACHINE_ARCH for the system
@hostname       the system host name, as set with sethostname()
@domainname     the system domain name, as set with setdomainname()
@kernel_ident   the kernel config file name
@osrelease      the release number of the OS
@ostype         the name of the OS (always "NetBSD" for NetBSD)

Example usage:

mkdir /arch/i386/bin
mkdir /arch/sparc/bin
ln -s /arch/@machine_arch/bin /bin
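
The expansion described above can be illustrated with a small user-space
sketch. This is not the kernel's implementation: the function name
expand_magic, the lookup table type, and the rule that a magic name must be
followed by '/' or the end of the string are all assumptions made for the
example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: a magic name and the text it expands to. */
struct magic_entry {
	const char *name;	/* text following '@' */
	const char *value;	/* replacement text */
};

/*
 * Copy src to dst, replacing "@name" with its value when name matches a
 * table entry and is followed by '/' or the end of the string (an
 * assumption of this sketch).  Unknown "@..." text is copied unchanged.
 * Returns 0 on success, -1 if dst is too small.
 */
int
expand_magic(const char *src, char *dst, size_t dstlen,
    const struct magic_entry *tab, size_t ntab)
{
	size_t out = 0;

	assert(src != NULL && dst != NULL);
	if (dstlen == 0)
		return -1;
	while (*src != '\0') {
		const char *rep = NULL;
		size_t skip = 0;

		if (*src == '@') {
			for (size_t i = 0; i < ntab; i++) {
				size_t n = strlen(tab[i].name);

				if (strncmp(src + 1, tab[i].name, n) == 0 &&
				    (src[1 + n] == '/' || src[1 + n] == '\0')) {
					rep = tab[i].value;
					skip = 1 + n;
					break;
				}
			}
		}
		if (rep != NULL) {
			size_t n = strlen(rep);

			if (out + n >= dstlen)
				return -1;
			memcpy(dst + out, rep, n);
			out += n;
			src += skip;
		} else {
			if (out + 1 >= dstlen)
				return -1;
			dst[out++] = *src++;
		}
	}
	dst[out] = '\0';
	return 0;
}
```

Requiring a terminator after the name keeps "@machine" from matching the
prefix of "@machine_arch", so the order of the table entries does not matter.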


# 1.247 29-May-2005 christos

- add const.
- remove unnecessary casts.
- add __UNCONST casts and mark them with XXXUNCONST as necessary.


Revision tags: kent-audio2-base
# 1.246 25-Apr-2005 lukem

Move the MI printing of `copyright' to the MD cpu_startup() code
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be setup
in cpu_startup() before printed output is remembered.


# 1.245 20-Apr-2005 blymn

Rototill of the verified exec functionality.
* We now use hash tables instead of a list to store the in kernel
fingerprints.
* Fingerprint methods handling has been made more flexible, it is now
even simpler to add new methods.
* the loader no longer passes in magic numbers representing the
fingerprint method so veriexecctl is no longer kernel specific.
* fingerprint methods can be tailored out using options in the kernel
config file.
* more fingerprint methods added - rmd160, sha256/384/512
* veriexecctl can now report the fingerprint methods supported by the
running kernel.
* regularised the naming of some portions of veriexec.


Revision tags: yamt-km-base4 yamt-km-base3 netbsd-3-base yamt-km-base2 yamt-km-base
# 1.244 23-Jan-2005 chs

branches: 1.244.6;
move the call to link_pool_init() to the end of uvm_init(). needed for sun3.


Revision tags: kent-audio1-beforemerge
# 1.243 09-Jan-2005 mycroft

branches: 1.243.2;
Rework the mountroot interface so that vfs_mountroot() opens the root device
and just passes it on to the file system functions. This avoids opening and
closing the device several times.

Mentioned on tech-kern some time ago, IIRC. I've been running this for a
long time.


Revision tags: kent-audio1-base
# 1.242 15-Oct-2004 thorpej

No longer need <sys/disk.h>


# 1.241 15-Oct-2004 thorpej

- Eliminate the need to call disk_init().
- disk_count needs to be protected with disklist_slock, too.


# 1.240 01-Oct-2004 yamt

introduce a function, proclist_foreach_call, to iterate all procs on
a proclist and call the specified function for each of them.
primarily to fix a procfs locking problem, but i think that it's useful for
others as well.

while i'm here, introduce PROCLIST_FOREACH macro, which is similar to
LIST_FOREACH but skips marker entries which are used by proclist_foreach_call.
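
A rough sketch of the "skip marker entries" idea, using a hand-rolled singly
linked list rather than the kernel's proclists. All names here (struct
proc_entry, P_MARKER, PROC_FOREACH, proc_foreach_call, sum_pid) are
hypothetical, and the sketch leaves out the part where the real routine
parks a marker in the list to remember its position.

```c
#include <stddef.h>

#define P_MARKER 0x1	/* hypothetical flag identifying a marker entry */

struct proc_entry {
	int pid;
	int flags;
	struct proc_entry *next;
};

/* Iterate over the list, skipping entries flagged as markers. */
#define PROC_FOREACH(var, head)						\
	for ((var) = (head); (var) != NULL; (var) = (var)->next)	\
		if (((var)->flags & P_MARKER) != 0)			\
			continue;					\
		else

/* Call fn on every real entry; stop early if fn returns nonzero. */
int
proc_foreach_call(struct proc_entry *head,
    int (*fn)(struct proc_entry *, void *), void *arg)
{
	struct proc_entry *p;
	int ret;

	PROC_FOREACH(p, head) {
		ret = fn(p, arg);
		if (ret != 0)
			return ret;
	}
	return 0;
}

/* Example callback: add the pid of each visited entry to *arg. */
int
sum_pid(struct proc_entry *p, void *arg)
{
	*(int *)arg += p->pid;
	return 0;
}
```

Callers that walk the list with the macro never see a marker, so a marker can
sit in the list without being mistaken for a process.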


# 1.239 05-Jul-2004 pk

Call inittodr() from main(). Let file system code set the recorded `last
update' time (if any) through the new function setrootfstime().


# 1.238 03-Jun-2004 nathanw

Initialize simple_lock in struct cwd; otherwise, one gets an
uninitialized lock panic at the first use of cwdshare().


# 1.237 06-May-2004 pk

Provide a mutex for the process limits data structure.


# 1.236 25-Apr-2004 simonb

Initialise (most) pools from a link set instead of explicit calls
to pool_init. Untouched pools are ones that are either in arch-specific
code, or aren't initialised during initial system startup.

Convert struct session, ucred and lockf to pools.


Revision tags: netbsd-2-0-3-RELEASE netbsd-2-1-RELEASE netbsd-2-1-RC6 netbsd-2-1-RC5 netbsd-2-1-RC4 netbsd-2-1-RC3 netbsd-2-1-RC2 netbsd-2-1-RC1 netbsd-2-0-2-RELEASE netbsd-2-0-1-RELEASE netbsd-2-base netbsd-2-0-RELEASE netbsd-2-0-RC5 netbsd-2-0-RC4 netbsd-2-0-RC3 netbsd-2-0-RC2 netbsd-2-0-RC1 netbsd-2-0-base
# 1.235 28-Mar-2004 matt

Make kernel continuations optional for now.


# 1.234 27-Mar-2004 jonathan

Use proper NetBSD conventions for deferred kthread creation, not the
other semantics from an earlier incarnation.

Call kcont_init() from init_main before device autoconfiguration,
so kcont is available to device drivers if required.

Also ensure the kthread process runs any pending continuations once
the kthread is finally up and running. For now, use a non-null timeout
to poll the queue periodically. Draining any pending requests just
before the kthread enters its ltsleep()/kc_run loop is cleaner, but
this is the version I tested with an early-in-boot kcont request.


# 1.233 09-Mar-2004 junyoung

Whitespaces.


# 1.232 09-Jan-2004 tls

Bump default size of vnode cache to 1% of physical memory, instead of
0.5%, based on some quick measurements on a number of workstations and
small fileservers (including my home fileserver running simultaneous
builds of the NetBSD source tree and several NetBSD kernels). This
brings the hit rate on my machines from below 70% to above 90%. We
should be able to tune this as we run, by tracking the hit rate and
increasing the size of the cache if memory permits.

Some systems will still require significantly larger cache sizes. Some
ports -- notably the 64-bit ones -- probably should use more than 1% of
physmem as the default due to the larger size of struct vnode.


# 1.231 05-Jan-2004 lukem

Store the copyright text in conf/copyright, and use conf/newvers.sh
to generate the appropriate const char copyright[] = "...";
statement instead of hard coding it into kern/init_main.c.
Idea from Simon Burge.


# 1.230 04-Jan-2004 jdolecek

Rearrange process exit path to avoid need to free resources from different
process context ('reaper').

From within the exiting process context:
* deactivate pmap and free vmspace while we can still block
* introduce MD cpu_lwp_free() - this cleans all MD-specific context (such
as FPU state), and is the last potentially blocking operation;
all of cpu_wait(), and most of cpu_exit(), is now folded into cpu_lwp_free()
* process is now immediately marked as zombie and made available for pickup
by parent; the remaining last lwp continues the exit as fully detached
* MI (rather than MD) code bumps uvmexp.swtch, cpu_exit() is now same
for both 'process' and 'lwp' exit

uvm_lwp_exit() is modified to never block; the u-area memory is now
always just linked to the list of available u-areas. Introduce (blocking)
uvm_uarea_drain(), which is called to release the excessive u-area memory;
this is called by parent within wait4(), or by pagedaemon on memory shortage.
uvm_uarea_free() is now private function within uvm_glue.c.

MD process/lwp exit code now always calls lwp_exit2() immediately after
switching away from the exiting lwp.

g/c now unneeded routines and variables, including the reaper kernel thread


# 1.229 01-Jan-2004 mycroft

Welcome to 2004!


# 1.228 30-Dec-2003 pk

Replace the traditional buffer memory management -- based on fixed per buffer
virtual memory reservation and a private pool of memory pages -- by a scheme
based on memory pools.

This allows better utilization of memory because buffers can now be allocated
with a granularity finer than the system's native page size (useful for
filesystems with e.g. 1k or 2k fragment sizes). It also avoids fragmentation
of virtual to physical memory mappings (due to the former fixed virtual
address reservation) resulting in better utilization of MMU resources on some
platforms. Finally, the scheme is more flexible by allowing run-time decisions
on the amount of memory to be used for buffers.

On the other hand, the effectiveness of the LRU queue for buffer recycling
may be somewhat reduced compared to the traditional method since, due to the
nature of the pool based memory allocation, the actual least recently used
buffer may release its memory to a pool different from the one needed by a
newly allocated buffer. However, this effect will kick in only if the
system is under memory pressure.


# 1.227 14-Nov-2003 jonathan

include <sys/mbuf.h> before FAST_IPSEC-dependent headers.


# 1.226 04-Nov-2003 dsl

Remove p_nras from struct proc - use LIST_EMPTY(&p->p_raslist) instead.
Remove p_raslock and rename p_lwplock p_lock (one lock is enough).
Simplify window test when adding a ras and correct test on VM_MAXUSER_ADDRESS.
Avoid unpredictable branch in i386 locore.S
(pad fields left in struct proc to avoid kernel bump)


# 1.225 02-Nov-2003 jdolecek

use LIST_FOREACH() as appropriate


# 1.224 07-Aug-2003 agc

Move UCB-licensed code from 4-clause to 3-clause licence.

Patches provided by Joel Baker in PR 22364, verified by myself.


# 1.223 06-Aug-2003 jonathan

(FAST_IPSEC): Pull in option header-test for FAST_IPSEC (and IPSEC).

If FAST_IPSEC is configured, attach fast-ipsec transforms after
autoconfiguring devices (perhaps including crypto hardware)
but before starting up network-device packet input.


# 1.222 30-Jul-2003 jonathan

Move the initialization of the crypto framework from the userland
pseudo-device to init_main(), so the framework is ready for
registration requests at autoconfiguration time.

Thanks to Quentin Garnier for confirming the change was required, and
for testing a similar fix.


# 1.221 29-Jun-2003 fvdl

branches: 1.221.2;
Back out the lwp/ktrace changes. They contained a lot of collateral damage,
and need to be examined and discussed more.


# 1.220 29-Jun-2003 thorpej

Undo part of the ktrace/lwp changes. In particular:
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.


# 1.219 28-Jun-2003 darrenr

Pass lwp pointers throughout the kernel, as required, so that the lwpid can
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.

Bump the kernel rev up to 1.6V


# 1.218 19-Mar-2003 dsl

Alternative pid/proc allocator, removes all searches associated with pid
lookup and allocation, and any dependency on NPROC or MAXUSERS.
NO_PID changed to -1 (and renamed NO_PGID) to remove artificial limit
on PID_MAX.
As discussed on tech-kern.


# 1.217 20-Jan-2003 christos

add support for p1003.1b semaphores. From FreeBSD


# 1.216 18-Jan-2003 thorpej

Merge the nathanw_sa branch.


Revision tags: nathanw_sa_before_merge fvdl_fs64_base nathanw_sa_base
# 1.215 01-Jan-2003 mycroft

Update copyright notice.


Revision tags: gmcgarry_ctxsw_base gmcgarry_ucred_base
# 1.214 11-Dec-2002 abs

branches: 1.214.2;
Define nofile and maxuprc variables (set to NOFILE and MAXUPRC), so they can
be patched in a compiled kernel.


# 1.213 05-Dec-2002 yamt

initialize uvm.aiodoned_proc.


# 1.212 24-Nov-2002 thorpej

Add an EVCNT_ATTACH_STATIC() macro which gathers static evcnts
into a link set, which are added to the list of event counters
at boot time.


# 1.211 17-Nov-2002 chs

add support for __MACHINE_STACK_GROWS_UP platforms. from fredette@


# 1.210 05-Nov-2002 thorpej

Add a new VM map, lkm_map, which machine-dependent code can provide
in the event that it needs to use a special VM range (x86_64 falls
into this category). We fall back onto kernel_map if machine-dependent
code doesn't create a special map.


Revision tags: kqueue-aftermerge
# 1.209 23-Oct-2002 jdolecek

merge kqueue branch into -current

kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals

kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)

based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe


Revision tags: kqueue-beforemerge kqueue-base
# 1.208 01-Oct-2002 thorpej

Add a generic config finalization hook, to be called once all real
devices have been discovered. All finalizer routines are iteratively
invoked until all of them report that they have done no work.

Use this hook to fix a latent bug in RAIDframe autoconfiguration of
RAID sets exposed by the rework of SCSI device discovery.
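
The "invoke until nobody did any work" loop can be sketched as below. This
is only an illustration of the iteration rule stated in the message; the
types and names (struct finalizer, run_finalizers, countdown) are
hypothetical and are not the kernel's interface.

```c
#include <stddef.h>

/* A finalizer returns nonzero if it did some work on this pass. */
typedef int (*finalizer_fn)(void *);

struct finalizer {
	finalizer_fn fn;
	void *arg;
};

/*
 * Run every finalizer repeatedly until a full pass completes in which
 * none of them reports doing work.  Returns the number of passes made.
 */
int
run_finalizers(const struct finalizer *tab, size_t n)
{
	int passes = 0;
	int did_work;

	do {
		did_work = 0;
		for (size_t i = 0; i < n; i++) {
			if (tab[i].fn(tab[i].arg) != 0)
				did_work = 1;
		}
		passes++;
	} while (did_work);
	return passes;
}

/* Example finalizer: has work to do while the counter is positive. */
int
countdown(void *arg)
{
	int *left = arg;

	if (*left <= 0)
		return 0;
	(*left)--;
	return 1;
}
```

Repeating whole passes matters because work done by one finalizer (for
example, configuring a device) may create new work for another that already
ran earlier in the same pass.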


# 1.207 25-Sep-2002 thorpej

Don't include <sys/map.h>.


# 1.206 04-Sep-2002 matt

Use the queue macros from <sys/queue.h> instead of referring to the queue
members directly. Use *_FOREACH whenever possible.


# 1.205 31-Aug-2002 sommerfeld

Initialize proc0.p_raslock to avoid a lock assertion on the first fork().


Revision tags: gehenna-devsw-base
# 1.204 25-Aug-2002 thorpej

Fix a signed/unsigned comparison warning from GCC 3.3.


# 1.203 24-Aug-2002 lukem

only print "init: trying /some/init" if RB_ASKNAME or if it's not the first
path we're trying. (the intent but not the behaviour of the previous rev.)


# 1.202 23-Aug-2002 lukem

in start_init(), if RB_ASKNAME is set in boothowto, ask for the path
name to start up as init (rather than just cycling thru initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.


# 1.201 17-Jun-2002 christos

Niels Provos systrace work, ported to NetBSD by kittenz and reworked...


# 1.200 27-May-2002 itojun

re-scan all ifnet after domaininit() for if_afdata initialization.


Revision tags: netbsd-1-6-RELEASE netbsd-1-6-RC3 netbsd-1-6-RC2 netbsd-1-6-RC1 netbsd-1-6-base eeh-devprop-base newlock-base
# 1.199 04-Mar-2002 simonb

branches: 1.199.2; 1.199.6; 1.199.8;
Use <sys/disk.h> for the prototype of disk_init() rather than declaring
our own locally.


Revision tags: ifpoll-base
# 1.198 11-Feb-2002 jdolecek

Switch default for pipes to the faster John S. Dyson's implementation.
Old, socketpair-based ones are available with option PIPE_SOCKETPAIR.


# 1.197 01-Jan-2002 perry

Happy New Year!


Revision tags: thorpej-mips-cache-base
# 1.196 12-Nov-2001 lukem

add RCSIDs


Revision tags: thorpej-devvp-base3 thorpej-devvp-base2 post-chs-ubcperf pre-chs-ubcperf thorpej-devvp-base
# 1.195 16-Aug-2001 chs

branches: 1.195.4;
user maps are always pageable.


# 1.194 18-Jul-2001 matt

When we auto size the vnode cache, make sure we do it *before* we
init vfs so it can take the size into account when creating its hash lists.
This means that for a 2GB system, it'll have a default of 65536 buckets
instead of 2048 and when you have 200,000+ vnodes that makes a significant
difference.


# 1.193 15-Jul-2001 jdolecek

Remove initial newline from copyright[], which was mistakenly added in rev.1.191.
Fixes kern/13470 by Tetsuya Isaki.


# 1.192 16-Jun-2001 jdolecek

branches: 1.192.2;
Add port of high performance pipe implementation written by John S. Dyson
for FreeBSD project. Besides huge speed boost compared with socketpair-based
pipes, this implementation also uses pageable kernel memory instead of mbufs.

Significant differences to FreeBSD version:
* uses uvm_loan() facility for direct write
* async/SIGIO handling correct also for sync writer, async reader
* limits settable via sysctl, amountpipekva and nbigpipes available via sysctl
* pipes are unidirectional - this is enforced on file descriptor level
for now only, the code would be updated to take advantage of it
eventually
* uses lockmgr(9)-based locks instead of home brew variant
* scatter-gather write is handled correctly for direct write case, data
is transferred by PIPE_DIRECT_CHUNK bytes maximum, to avoid running out of kva

All FreeBSD/NetBSD specific code is within appropriate #ifdef, in preparation
to feed changes back to FreeBSD tree.

This pipe implementation is optional for now, add 'options NEW_PIPE'
to your kernel config to use it.


# 1.191 08-Jun-2001 mrg

use real \n's in copyright[]; avoids gcc 3.0-prerelease warnings.


Revision tags: thorpej_scsipi_beforemerge thorpej_scsipi_nbase thorpej_scsipi_base
# 1.190 13-Apr-2001 thorpej

Remove the use of splimp() from the NetBSD kernel. splnet()
and only splnet() is allowed for the protection of data structures
used by network devices.


# 1.189 15-Mar-2001 chs

eliminate the KERN_* error codes in favor of the traditional E* codes.
the mapping is:

KERN_SUCCESS 0
KERN_INVALID_ADDRESS EFAULT
KERN_PROTECTION_FAILURE EACCES
KERN_NO_SPACE ENOMEM
KERN_INVALID_ARGUMENT EINVAL
KERN_FAILURE various, mostly turn into KASSERTs
KERN_RESOURCE_SHORTAGE ENOMEM
KERN_NOT_RECEIVER <unused>
KERN_NO_ACCESS <unused>
KERN_PAGES_LOCKED <unused>
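
The table above can be read as a simple translation function. The sketch
below is an illustration only: the OLD_KERN_* enumerators are stand-ins whose
numeric values are not the historical ones, the function name kern_to_errno
is hypothetical, and KERN_FAILURE is left out because the message says it
mostly became assertions rather than a single errno.

```c
#include <errno.h>

/* Stand-ins for the retired codes; values are arbitrary for this sketch. */
enum old_kern_code {
	OLD_KERN_SUCCESS,
	OLD_KERN_INVALID_ADDRESS,
	OLD_KERN_PROTECTION_FAILURE,
	OLD_KERN_NO_SPACE,
	OLD_KERN_INVALID_ARGUMENT,
	OLD_KERN_RESOURCE_SHORTAGE
};

/* Translate an old-style code to the errno value listed in the table. */
int
kern_to_errno(enum old_kern_code code)
{
	switch (code) {
	case OLD_KERN_SUCCESS:
		return 0;
	case OLD_KERN_INVALID_ADDRESS:
		return EFAULT;
	case OLD_KERN_PROTECTION_FAILURE:
		return EACCES;
	case OLD_KERN_NO_SPACE:
	case OLD_KERN_RESOURCE_SHORTAGE:
		return ENOMEM;
	case OLD_KERN_INVALID_ARGUMENT:
		return EINVAL;
	default:
		/* Assumption of this sketch: treat anything else as EINVAL. */
		return EINVAL;
	}
}
```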


# 1.188 01-Jan-2001 thorpej

branches: 1.188.2;
Happy new year!


# 1.187 11-Dec-2000 mycroft

Introduce 2 new flags in types.h:
* __HAVE_SYSCALL_INTERN. If this is defined, e_syscall is replaced by
e_syscall_intern, which is called at key places in the kernel. This can be
used to set a MD syscall handler pointer. This obsoletes and replaces the
*_HAS_SEPARATED_SYSCALL flags.
* __HAVE_MINIMAL_EMUL. If this is defined, certain (deprecated) elements in
struct emul are omitted.


# 1.186 08-Dec-2000 jdolecek

call exec_init() before letting init(8) exec


# 1.185 27-Nov-2000 chs

Initial integration of the Unified Buffer Cache project.


# 1.184 21-Nov-2000 jdolecek

restructure struct emul and execsw, in preparation to make emulations LKMable:
* move all exec-type specific information from struct emul to execsw[] and
provide single struct emul per emulation
* elf:
- kern/exec_elf32.c:probe_funcs[] is gone, execsw[] now has one entry
per emulation and contains pointer to respective probe function
- interp is allocated via MALLOC() rather than on stack
- elf_args structure is allocated via MALLOC() rather than malloc()
* ecoff: the per-emulation hooks moved from alpha and mips specific code
to OSF1 and Ultrix compat code as appropriate, execsw[] has one entry per
emulation supporting ecoff with appropriate probe function
* the makecmds/probe functions don't set emulation, pointer to emulation is
part of appropriate execsw[] entry
* constify couple of structures


# 1.183 13-Nov-2000 jdolecek

change the type of *syscallnames[] array to 'const char * const foo[]'


# 1.182 29-Oct-2000 he

Use an rlim_t to store "available memory", so we don't needlessly
overflow and/or sign extend.


# 1.181 13-Sep-2000 thorpej

Add an align argument to uvm_map() and some callers of that
routine. Works similarly to pmap_prefer(), but allows callers
to specify a minimum power-of-two alignment of the region.
How we ever got along without this for so long is beyond me.
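
For readers unfamiliar with power-of-two alignment, the arithmetic usually
looks like the sketch below. The helper names (is_pow2, align_up) are
hypothetical, and treating an alignment of 0 as "no constraint" is an
assumption of this example, not a statement about uvm_map() itself.

```c
#include <stdint.h>

/* Nonzero if align is a power of two. */
int
is_pow2(uintptr_t align)
{
	return align != 0 && (align & (align - 1)) == 0;
}

/*
 * Round addr up to the next multiple of align, which must be a power
 * of two.  An align of 0 means "no alignment requested" in this sketch.
 */
uintptr_t
align_up(uintptr_t addr, uintptr_t align)
{
	if (align == 0)
		return addr;
	return (addr + align - 1) & ~(align - 1);
}
```

The mask trick only works for powers of two, which is why a caller would
check is_pow2() before relying on align_up().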


# 1.180 26-Aug-2000 sommerfeld

More MP clock/scheduler changes:
- Periodically invoke roundrobin() from hardclock() on all cpu's rather
than from a timer callout; this allows time-slicing on non-primary cpu's.
- Make pscnt per-cpu.
- Notice psdiv changes on each cpu, and adjust pscnt at that point.
Also, invoke setstatclockrate() from the clock interrupt when each cpu
notices the divisor change, rather than when starting/stopping the
profiling clock.


# 1.179 22-Aug-2000 thorpej

Define the MI parts of the "big kernel lock" perimeter. From
Bill Sommerfeld.


# 1.178 21-Aug-2000 thorpej

splhigh() -> splsched()


# 1.177 12-Aug-2000 thorpej

Don't bother with a trampoline to start the pagedaemon and
reaper threads.


# 1.176 14-Jul-2000 thorpej

- Fix the likely cause of the "ps(1) hangs machine" problem. Always
vslock the user pages for the data being copied out to userspace,
so that we won't sleep while holding a lock in case we need to
fault the pages in.
- Sprinkle some const and ANSI'ify some things while here.


# 1.175 06-Jul-2000 jdolecek

adjust maximum number of vnodes in vnode cache according
to machine memory size upon boot if the number has not been specified
explicitly in kernel config - at this moment, 0.5% of system
memory is used for vnodes (but minimum NVNODE vnodes)
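
The sizing rule can be sketched as follows. This is not the kernel's formula:
the function name default_vnode_limit and its parameters are hypothetical,
and the per-vnode size is passed in because the log does not state it.

```c
#include <stdint.h>

/*
 * Pick a vnode limit: devote permille/1000 of physical memory to vnodes
 * (5 corresponds to 0.5%), but never go below the configured minimum.
 */
uint64_t
default_vnode_limit(uint64_t physmem_bytes, unsigned permille,
    uint64_t vnode_size, uint64_t min_vnodes)
{
	uint64_t budget, n;

	if (vnode_size == 0)
		return min_vnodes;
	/* Divide first so the multiplication cannot overflow. */
	budget = physmem_bytes / 1000 * permille;
	n = budget / vnode_size;
	return n > min_vnodes ? n : min_vnodes;
}
```

With a made-up vnode size of 200 bytes and a minimum of 1000, a 1 GiB machine
gets a limit well above the minimum, while a 16 MiB machine falls back to the
minimum.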


# 1.174 27-Jun-2000 mrg

remove include of <vm/vm.h>


# 1.173 25-Jun-2000 mrg

<vm/vm_pageout.h> is already empty; kill it totally.


Revision tags: netbsd-1-5-base
# 1.172 06-Jun-2000 soren

branches: 1.172.2;
defopt SYSCALL_DEBUG.


# 1.171 31-May-2000 thorpej

Track which process a CPU is running/has last run on by adding a
p_cpu member to struct proc. Use this in certain places when
accessing scheduler state, etc. For the single-processor case,
just initialize p_cpu in fork1() to avoid having to set it in the
low-level context switch code on platforms which will never have
multiprocessing.

While I'm here, comment a few places where there are known issues
for the SMP implementation.


# 1.170 28-May-2000 jhawk

Add proc0 to pidhashtbl so pfind(0) works.
Now trace/t 0 works in ddb, etc.


# 1.169 28-May-2000 thorpej

Rather than starting init and creating kthreads by forking and then
doing a cpu_set_kpc(), just pass the entry point and argument all
the way down the fork path starting with fork1(). In order to
avoid special-casing the normal fork in every cpu_fork(), MI code
passes down child_return() and the child process pointer explicitly.

This fixes a race condition on multiprocessor systems; a CPU could
grab the newly created processes (which has been placed on a run queue)
before cpu_set_kpc() would be performed.


Revision tags: minoura-xpg4dl-base
# 1.168 26-May-2000 thorpej

branches: 1.168.2;
First sweep at scheduler state cleanup. Collect MI scheduler
state into global and per-CPU scheduler state:

- Global state: sched_qs (run queues), sched_whichqs (bitmap
of non-empty run queues), sched_slpque (sleep queues).
NOTE: These may collectively move into a struct schedstate
at some point in the future.

- Per-CPU state, struct schedstate_percpu: spc_runtime
(time process on this CPU started running), spc_flags
(replaces struct proc's p_schedflags), and
spc_curpriority (usrpri of processes on this CPU).

- Every platform must now supply a struct cpu_info and
a curcpu() macro. Simplify existing cpu_info declarations
where appropriate.

- All references to per-CPU scheduler state now made through
curcpu(). NOTE: this will likely be adjusted in the future
after further changes to struct proc are made.

Tested on i386 and Alpha. Changes are mostly mechanical, but apologies
in advance if it doesn't compile on a particular platform.


# 1.167 26-May-2000 thorpej

Introduce a new process state distinct from SRUN called SONPROC
which indicates that the process is actually running on a
processor. Test against SONPROC as appropriate rather than
combinations of SRUN and curproc. Update all context switch code
to properly set SONPROC when the process becomes the current
process on the CPU.


# 1.166 24-Mar-2000 enami

Call the routine to calculate callwheelsize from allocsys() instead of
main() since some ports like alpha and mips call allocsys() before main()
is called. While I'm here, I renamed some function.


# 1.165 23-Mar-2000 thorpej

New callout mechanism with two major improvements over the old
timeout()/untimeout() API:
- Clients supply callout handle storage, thus eliminating problems of
resource allocation.
- Insertion and removal of callouts is constant time, important as
this facility is used quite a lot in the kernel.

The old timeout()/untimeout() API has been removed from the kernel.
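
One common way to get both properties listed above (caller-supplied storage
and constant-time insertion and removal) is a hashed timing wheel built from
intrusive doubly linked lists. The sketch below illustrates that general
technique; it is not the kernel's code, and every name in it (struct callout
layout, struct wheel, WHEEL_SIZE, wheel_init, callout_schedule,
callout_cancel, wheel_tick, bump) is hypothetical.

```c
#include <stddef.h>

#define WHEEL_SIZE 64	/* hypothetical; must be a power of two */

/* Caller-owned storage; must be zero-initialized before first use. */
struct callout {
	struct callout *next, *prev;	/* links within one bucket */
	unsigned long expire;		/* absolute tick at which to fire */
	void (*fn)(void *);
	void *arg;
	int pending;
};

struct wheel {
	struct callout bucket[WHEEL_SIZE];	/* circular list heads */
	unsigned long now;			/* current tick */
};

void
wheel_init(struct wheel *w)
{
	for (size_t i = 0; i < WHEEL_SIZE; i++)
		w->bucket[i].next = w->bucket[i].prev = &w->bucket[i];
	w->now = 0;
}

/* Unlink c if it is pending: constant time. */
void
callout_cancel(struct callout *c)
{
	if (!c->pending)
		return;
	c->prev->next = c->next;
	c->next->prev = c->prev;
	c->pending = 0;
}

/* Arrange for fn(arg) to run `ticks` ticks from now: constant time. */
void
callout_schedule(struct wheel *w, struct callout *c, unsigned long ticks,
    void (*fn)(void *), void *arg)
{
	struct callout *head;

	callout_cancel(c);
	if (ticks == 0)
		ticks = 1;	/* the current bucket was already processed */
	c->expire = w->now + ticks;
	c->fn = fn;
	c->arg = arg;
	head = &w->bucket[c->expire & (WHEEL_SIZE - 1)];
	c->next = head;
	c->prev = head->prev;
	head->prev->next = c;
	head->prev = c;
	c->pending = 1;
}

/* Advance one tick and fire the callouts in this bucket that are due. */
void
wheel_tick(struct wheel *w)
{
	struct callout *head, *c, *next;

	w->now++;
	head = &w->bucket[w->now & (WHEEL_SIZE - 1)];
	for (c = head->next; c != head; c = next) {
		next = c->next;
		if (c->expire != w->now)
			continue;	/* belongs to a later revolution */
		callout_cancel(c);
		c->fn(c->arg);
	}
}

/* Example handler: increment the int that arg points to. */
void
bump(void *arg)
{
	++*(int *)arg;
}
```

Because the list links live inside the caller's struct callout, scheduling
never allocates memory, and cancelling only touches the two neighbours of the
entry being removed.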


# 1.164 10-Mar-2000 enami

Create new kernel thread to issue statfs(2) system call to check free
disk space rather than doing it in timeout handler. This fixes long
standing bug that accounting file can't be put on NFS file system (so,
e.g, we couldn't turn on accounting on diskless system).


Revision tags: chs-ubc2-newbase
# 1.163 24-Jan-2000 thorpej

Add a `config_pending' semaphore to block mounting of the root file system
until all device driver discovery threads have had a chance to do their
work. This in turn blocks initproc's exec of init(8) until root is
mounted and process start times and CWD info has been fixed up.

Addresses kern/9247.


# 1.162 19-Jan-2000 thorpej

Move callout initialization to a single location; no need to duplicate
that code all over the place.


# 1.161 01-Jan-2000 mycroft

Update for y2k.


Revision tags: wrstuden-devbsize-19991221 wrstuden-devbsize-base
# 1.160 16-Dec-1999 thorpej

Explicitly set secondary processors in motion before calling uvm_scheduler().


# 1.159 15-Nov-1999 fvdl

Add Kirk McKusick's soft updates code to the trunk. Not enabled by
default, as the copyright on the main file (ffs_softdep.c) is such
that it has been put into gnusrc. options SOFTDEP will pull this
in. This code also contains the trickle syncer.

Bump version number to 1.4O


Revision tags: fvdl-softdep-base
# 1.158 13-Nov-1999 simonb

Defopt MAXUPRC.


Revision tags: comdex-fall-1999-base
# 1.157 28-Sep-1999 bouyer

branches: 1.157.2; 1.157.4; 1.157.8;
Replace kern.shortcorename sysctl with a more flexible scheme,
core filename format, which allows changing the name of the core dump,
and to relocate it in a directory. Credits to Bill Sommerfeld for giving me
the idea :)
The default core filename format can be changed by options DEFCORENAME and/or
kern.defcorename
Create a new sysctl tree, proc, which holds per-process values (for now
the corename format, and resource limits). A process is designated by its pid
at the second level name. These values are inherited on fork, and the corename
format is reset to defcorename on suid/sgid exec.
Create a p_sugid() function, to take appropriate actions on suid/sgid
exec (for now set the P_SUGID flag and reset the per-proc corename).
Adjust dosetrlimit() to allow changing limits of one proc by another, with
credential controls.


# 1.156 17-Sep-1999 thorpej

- Centralize the declaration and clearing of `cold'.
- Call configure() after setting up proc0.
- Call initclocks() from configure(), after cpu_configure(). Once the
clocks are running, clear `cold'. Then run interrupt-driven
autoconfiguration.


# 1.155 15-Sep-1999 thorpej

Rename the machine-dependent autoconfiguration entry point `cpu_configure()',
and rename config_init() to configure() and call cpu_configure() from there.


Revision tags: chs-ubc2-base
# 1.154 22-Jul-1999 thorpej

Add a read/write lock to the proclists and PID hash table. Use the
write lock when doing PID allocation, and during the process exit path.
Use a read lock every where else, including within schedcpu() (interrupt
context). Note that holding the write lock implies blocking schedcpu()
from running (blocks softclock).

PID allocation is now MP-safe.

Note this actually fixes a bug on single processor systems that was probably
extremely difficult to tickle; it was possible that schedcpu() would run
off a bad pointer if the right clock interrupt happened to come in the
middle of a LIST_INSERT_HEAD() or LIST_REMOVE() to/from allproc.


# 1.153 06-Jul-1999 thorpej

Make the kthread API a bit more friendly to loadable kernel modules.


# 1.152 07-Jun-1999 thorpej

Don't pass a nam2blk around at all; just have setroot() and friends reference
dev_name2blk[] directly. Addresses PR #7622 (ITOH Yasufumi), although
in a different way.


# 1.151 13-May-1999 thorpej

Allow the caller to specify a stack for the child process. If NULL,
the child inherits the stack pointer from the parent (traditional
behavior). Like the signal stack, the stack area is specified as
a low address and a size; machine-dependent code accounts for stack
direction.

This is required for clone(2).
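
The "low address and a size" convention above can be illustrated with a
minimal sketch. The names initial_sp and enum stack_dir are hypothetical,
invented for this example; real machine-dependent code would also handle
alignment and frame setup, which is omitted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum stack_dir { STACK_GROWS_DOWN, STACK_GROWS_UP };

/*
 * The caller describes the stack area as (low address, size).
 * The starting stack pointer depends on the growth direction:
 * the top of the area if the stack grows down, the base if it grows up.
 */
static uintptr_t
initial_sp(uintptr_t base, size_t size, enum stack_dir dir)
{
	assert(size > 0);
	return dir == STACK_GROWS_DOWN ? base + size : base;
}
```

The caller never needs to know the direction; only the helper does.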


# 1.150 13-May-1999 thorpej

Allow an alternate exit signal (i.e. not SIGCHLD) to be delivered to the
parent, specified at fork time. Specify a new flag to wait4(2), WALTSIG,
to wait for processes which use an alternate exit signal.

This is required for clone(2).


# 1.149 30-Apr-1999 thorpej

Pull signal actions out of struct user, make them a separate proc
substructure, and allow them to be shared.

Required for clone(2).


# 1.148 30-Apr-1999 thorpej

Break cdir/rdir/cmask info out of struct filedesc, and put it in a new
substructure, `cwdinfo'. Implement optional sharing of this substructure.

This is required for clone(2).


# 1.147 25-Apr-1999 simonb

g/c REAL_CLISTS.


# 1.146 12-Apr-1999 gwr

minor nits -- strncpy into p->p_comm


Revision tags: kame_141_19991130 netbsd-1-4-PATCH001 kame_14_19990705 kame_14_19990628 netbsd-1-4-RELEASE netbsd-1-4-base
# 1.145 01-Apr-1999 thorpej

branches: 1.145.2; 1.145.4;
Call cpu_startup() immediately after uvm_init(), but before mbinit().
Call configure() directly immediately after config_init().

This causes autoconfiguration to happen at the same time as before, but
creates some kernel submaps earlier, so that e.g. mbinit() can now
allocate memory.


# 1.144 26-Mar-1999 thorpej

Assign initproc in main(), not start_init(). It's convenient to do so.


# 1.143 24-Mar-1999 mrg

completely remove Mach VM support. all that is left is the
header files, as UVM still uses (most of) these.


# 1.142 05-Mar-1999 mycroft

This is sort of gratuitous, but...
Strip the leading path off of init's argv[0].
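
One common way to strip a leading path is to look for the last '/'. The
sketch below uses a hypothetical helper, strip_path, and is not taken from
the kernel source.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper: return the component after the last '/',
 * or the whole string if it contains no '/'.
 */
static const char *
strip_path(const char *path)
{
	const char *slash;

	assert(path != NULL);
	slash = strrchr(path, '/');
	return slash != NULL ? slash + 1 : path;
}
```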


# 1.141 22-Feb-1999 cjs

Safer use of printf.


# 1.140 21-Jan-1999 christos

Fix initialization of resource limits for number of files and number
of processes:
- Don't initialize rlim_max to RLIM_INFINITY. The limits for those should
be maxfiles and maxproc respectively. Programs expect getrlimit to
return reasonable values, so that they can allocate structures (for
example jdk does this).
- Don't initialize rlim_cur to NOFILE and MAXUPRC respectively, but to
min(NOFILE, maxfiles) and min(MAXUPRC, maxproc) respectively.
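
The rule in the two points above can be sketched as follows. struct
rlim_sketch and init_limit are hypothetical stand-ins for struct rlimit and
the kernel's initialization code; the numbers used with them are arbitrary.

```c
#include <assert.h>

/* Hypothetical stand-in for struct rlimit. */
struct rlim_sketch {
	long cur;	/* soft limit */
	long max;	/* hard limit */
};

/*
 * The hard limit is the system-wide maximum (e.g. maxfiles or maxproc),
 * not "infinity". The soft limit is the compiled-in default
 * (e.g. NOFILE or MAXUPRC), clamped so it never exceeds the hard limit.
 */
static struct rlim_sketch
init_limit(long compiled_default, long system_max)
{
	struct rlim_sketch r;

	assert(compiled_default >= 0 && system_max >= 0);
	r.max = system_max;
	r.cur = compiled_default < system_max ? compiled_default : system_max;
	return r;
}
```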


# 1.139 16-Jan-1999 chuck

MNN is no longer optional


# 1.138 06-Jan-1999 lukem

add copyright 1999


Revision tags: kenh-if-detach-base
# 1.137 14-Nov-1998 thorpej

Implement a way to queue kernel threads for creation after init,
pagedaemon, reaper, etc. Caller provides a callback function and
argument which will be called to create the threads.
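
The queue-then-run pattern described above might look roughly like this.
All names (defer_create, run_deferred, bump_counter, MAX_DEFERRED) are
invented for the sketch; the fixed-size array is an assumption made to keep
it short, and the kernel's actual interface is not shown here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEFERRED 8	/* arbitrary capacity for this sketch */

struct deferred {
	void (*fn)(void *);	/* callback that creates the thread(s) */
	void *arg;		/* argument handed back to the callback */
};

static struct deferred deferred_q[MAX_DEFERRED];
static size_t deferred_n;

/* Record a callback to run later; returns 0 on success, -1 if full. */
static int
defer_create(void (*fn)(void *), void *arg)
{
	assert(fn != NULL);
	if (deferred_n == MAX_DEFERRED)
		return -1;
	deferred_q[deferred_n].fn = fn;
	deferred_q[deferred_n].arg = arg;
	deferred_n++;
	return 0;
}

/* Run every queued callback in order; returns how many were run. */
static size_t
run_deferred(void)
{
	size_t i, n = deferred_n;

	for (i = 0; i < n; i++)
		deferred_q[i].fn(deferred_q[i].arg);
	deferred_n = 0;
	return n;
}

/* Example callback: just counts how often it was called. */
static void
bump_counter(void *arg)
{
	++*(int *)arg;
}
```

Nothing runs at registration time; the callbacks fire only when
run_deferred() is called, which here plays the role of "after init,
pagedaemon, reaper, etc. exist".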


# 1.136 11-Nov-1998 thorpej

fork_kthread() -> kthread_create(). Set P_NOCLDWAIT on kernel threads,
which will cause any of their children to be reparented to init(8) (which
is already prepared to wait out orphaned processes).


# 1.135 11-Nov-1998 thorpej

Initial version of API for creating kernel threads (likely to change somewhat
in the future):
- New function, fork_kthread(), takes entry point, argument for entry point,
and comment for new proc. May be called by any context, will fork the
thread from proc0 (requires slight changes to cpu_fork()).
- cpu_set_kpc() now takes a third argument, a void *arg to pass to the
thread entry point. Thread entry point now takes void * instead of
struct proc *.
- Create the pagedaemon and reaper kernel threads using fork_kthread().


Revision tags: chs-ubc-base
# 1.134 19-Oct-1998 tron

branches: 1.134.2;
Defopt SYSVMSG, SYSVSEM and SYSVSHM.


# 1.133 19-Oct-1998 pk

Allow `curproc' to be defined in <machine/proc.h> to enable a transition
to SMP support.


# 1.132 08-Sep-1998 thorpej

Implement a new kernel thread, the "reaper", which performs the task
of freeing the VM resources once a process has exited. A valid thread
must do this work, as doing so may block in a multi-processor environment.


# 1.131 31-Aug-1998 thorpej

Use the pool allocator and "nointr" pool page allocator for file structures.


# 1.130 13-Aug-1998 eeh

Merge paddr_t changes into the main branch.


# 1.129 04-Aug-1998 perry

Abolition of bcopy, ovbcopy, bcmp, and bzero, phase one.
bcopy(x, y, z) -> memcpy(y, x, z)
ovbcopy(x, y, z) -> memmove(y, x, z)
bcmp(x, y, z) -> memcmp(x, y, z)
bzero(x, y) -> memset(x, 0, y)
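
The mapping above swaps the first two arguments for the copy routines:
bcopy() takes the source first, memcpy() the destination first. A minimal
sketch in standard C; the compat_* wrapper names are hypothetical and exist
only to make the argument order explicit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* bcopy(src, dst, len) -> memcpy(dst, src, len): arguments swap. */
static void
compat_bcopy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memcpy(dst, src, len);
}

/* ovbcopy() tolerated overlapping regions, so it maps to memmove(). */
static void
compat_ovbcopy(const void *src, void *dst, size_t len)
{
	assert(src != NULL && dst != NULL);
	memmove(dst, src, len);
}

/* bcmp() only reports equal (0) or different (non-zero). */
static int
compat_bcmp(const void *a, const void *b, size_t len)
{
	return memcmp(a, b, len) != 0;
}

/* bzero(p, len) -> memset(p, 0, len). */
static void
compat_bzero(void *p, size_t len)
{
	memset(p, 0, len);
}
```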


# 1.128 02-Aug-1998 thorpej

Use the pool allocator for sockets.


# 1.127 01-Aug-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach, take two.

Note that it is necessary that mbinit() NOT allocate memory, since it
is called before mb_map is created. This is not a problem with the
pool allocator that is now used for mbufs and mbuf clusters.


# 1.126 01-Aug-1998 thorpej

Oops, back out previous. I need to attack that problem differently.


# 1.125 31-Jul-1998 perry

fix sizeofs so they comply with the KNF style guide. yes, it is pedantic.


# 1.124 31-Jul-1998 thorpej

Initialize the mbuf allocator _before_ autoconfiguration; it might be
called when devices attach.


Revision tags: eeh-paddr_t-base
# 1.123 25-Jun-1998 thorpej

branches: 1.123.2;
defopt NFSSERVER


# 1.122 30-Mar-1998 mycroft

Oops; forgot to update prototype.


# 1.121 30-Mar-1998 mycroft

Argument to main() is no longer used.


# 1.120 27-Mar-1998 thorpej

Make proc0 use the statically-allocated vmspace0 again, and make it use
the kernel's pmap, since proc0 (and others that share its address space)
are kernel-only processes, and should never contain userspace mappings.

This makes it easier to detect errors, like entering user mappings
for kernel processes, in pmap modules, and makes some sense, considering
that kernel processes are really just "thread contexts" for the kernel.


# 1.119 22-Mar-1998 thorpej

Process 2 (the pagedaemon) always runs in kernel space, so share VM
space with proc0.


# 1.118 01-Mar-1998 fvdl

Merge with Lite2 + local changes


# 1.117 19-Feb-1998 thorpej

Include the NFS option header.


# 1.116 14-Feb-1998 thorpej

Prevent the session ID from disappearing if the session leader exits
(thus causing s_leader to become NULL) by storing the session ID separately
in the session structure. Export the session ID to userspace in the
eproc structure.

Submitted by Tom Proett <proett@nas.nasa.gov>.


# 1.115 10-Feb-1998 mrg

- add defopt's for UVM, UVMHIST and PMAP_NEW.
- remove unnecessary UVMHIST_DECL's.


# 1.114 05-Feb-1998 mrg

initial import of the new virtual memory system, UVM, into -current.

UVM was written by chuck cranor <chuck@maria.wustl.edu>, with some
minor portions derived from the old Mach code. i provided some help
getting swap and paging working, and other bug fixes/ideas. chuck
silvers <chuq@chuq.com> also provided some other fixes.

this is the rest of the MI portion changes.

this will be KNF'd shortly. :-)


# 1.113 08-Jan-1998 mrg

add new version of non contiguous memory code, written by chuck cranor,
called "MACHINE_NEW_NONCONTIG". this is required for UVM, the new VM
system (also written by chuck) that is coming soon. adds new functions:
vm_page_physload() -- tell the VM system about an area of memory.
vm_physseg_find() -- returns index in vm_physmem array that this
address is in.
and several new versions of old functions/macros defined in vm_page.h.

this is the MI portion. sparc, and then later i386 portions to come.
all other ports need to change to this ASAP! (alpha is already being
worked on)


# 1.112 07-Jan-1998 thorpej

Happy new year!


# 1.111 06-Jan-1998 thorpej

Clean up the forking of init and the pagedaemon slightly: call fork1()
directly (which provides a pointer to the new process).


# 1.110 06-Jan-1998 thorpej

Garbage-collect cpu_set_init_frame(); it hasn't been needed for some time
now.


# 1.109 05-Jan-1998 thorpej

Initialize proc0's file descriptor table with fdinit1().


Revision tags: netbsd-1-3-PATCH001 netbsd-1-3-RELEASE netbsd-1-3-BETA netbsd-1-3-base
# 1.108 19-Oct-1997 mycroft

branches: 1.108.2;
Add const where appropriate.


# 1.107 17-Oct-1997 thorpej

Display The NetBSD Foundation, Inc.'s copyright notice at boot time.


Revision tags: marc-pcmcia-base
# 1.106 13-Oct-1997 explorer

o Make usage of /dev/random dependent on
pseudo-device rnd # /dev/random and in-kernel generator
in config files.

o Add declaration to all architectures.

o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Ts'o's Linux code.


# 1.105 10-Oct-1997 mycroft

GC pageproc and bclnlist.


# 1.104 09-Oct-1997 explorer

make /dev/random standard, per message from Jason


# 1.103 09-Oct-1997 explorer

add hooks to initialize the random driver


# 1.102 11-Sep-1997 mycroft

Fix execve(2) and *setregs() interfaces so emulations can set registers in a
more correct way. (See tech-kern.)


Revision tags: thorpej-signal-base marc-pcmcia-bp
# 1.101 14-Jun-1997 thorpej

branches: 1.101.4; 1.101.6;
Call cpu_dumpconf() after cpu_rootconf().


# 1.100 12-Jun-1997 mrg

remove swap configuration.


# 1.99 16-May-1997 gwr

Eliminate vmspace.vm_pmap and all references to it unless
__VM_PMAP_HACK is defined (for temporary compatibility).
The __VM_PMAP_HACK code should be removed after all the
ports that define it have removed all vm_pmap references.


# 1.98 26-Mar-1997 gwr

Move findroot/setroot stuff from configure() to cpu_rootconf().


Revision tags: is-newarp-before-merge is-newarp-base
# 1.97 02-Feb-1997 thorpej

Add missing \n in printf format for "cannot mount root" error message.
Pointed out by cgd@netbsd.org


# 1.96 31-Jan-1997 cgd

fix check_console() changes:
* prototype it before it is used (several ports compile with
-Wstrict-prototypes -Wmissing-prototypes), so this is _necessary_.
* conform to C syntax (yes, that's right, it wouldn't parse).
* make error check less error-prone, + style fixups.


# 1.95 31-Jan-1997 thorpej

- NFSCLIENT -> NFS
- Run mountroot hooks before we attempt to mount the root device, and
destroy mountroot hooks after the root file system has been successfully
mounted.
- Don't panic if we can't mount root. Instead, set RB_ASKNAME and
call setroot(), which will prompt the operator for the root device
and file system type.


# 1.94 31-Jan-1997 mouse

Oops, forgot the #include.


# 1.93 31-Jan-1997 mouse

Apply the interim fix from PR 2236, reformatted and a comment added.
Not a real fix, but it should help until we get a real fix done.


# 1.92 22-Dec-1996 cgd

branches: 1.92.2;
* catch up with system call argument type fixups/const poisoning.
* Fix arguments to various copyin()/copyout() invocations, to avoid
gratuitous casts.
* Some KNF formatting fixes


# 1.91 03-Dec-1996 thorpej

Make NFSSERVER work without NFSCLIENT. This is achieved by splitting
the client and server/shared data initialization into separate functions,
and calling the server/shared initialization directly from main().
Problem noted in PR #1308 (Kenneth Stailey) and PR #1780 (Chris Demetriou).
Fix suggested in PR #1780 by Chris Demetriou, and munged a bit by me,
and OK'd by Frank van der Linden <fvdl@netbsd.org>.


# 1.90 13-Oct-1996 christos

backout previous kprintf change


# 1.89 10-Oct-1996 christos

printf -> kprintf, sprintf -> ksprintf


# 1.88 10-Oct-1996 thorpej

Fix botch in netbsd-1-2 merge (multiple inclusion of <sys/tty.h>),
pointed out by Jonathan Stone <jonathan@DSG.Standford.EDU>.


# 1.87 09-Oct-1996 thorpej

Merge the netbsd-1-2 branch back into the mainline.


# 1.86 05-Oct-1996 scottr

Expand tab in copyright message; it loses on some consoles.


# 1.85 29-May-1996 mrg

call tty_init().


Revision tags: netbsd-1-2-base
# 1.84 22-Apr-1996 christos

branches: 1.84.4;
remove include of <sys/cpu.h>


# 1.83 04-Apr-1996 cgd

call config_init() before autoconfiguration, to initialize alldevs and
allevents lists.


# 1.82 09-Feb-1996 christos

More proto fixes


# 1.81 04-Feb-1996 christos

First pass at prototyping


# 1.80 07-Jan-1996 thorpej

New generic disk framework. Highlights:

- New metrics handling. Metrics are now kept in the new
`struct disk'. Busy time is now stored as a timeval, and
transfer count in bytes.

- Storage for disklabels is now dynamically allocated, so that
the size of the disk structure is not machine-dependent.

- Several new functions for attaching and detaching disks, and
handling metrics calculation.

Old-style instrumentation is still supported in drivers that did it before.
However, old-style instrumentation is being deprecated, and will go away
once the userland utilities are updated for the new framework.

For usage and architectural details, see the forthcoming disk(9) manual
page.


# 1.79 09-Dec-1995 mycroft

Eliminate an extra variable.


Revision tags: netbsd-1-1-PATCH001 netbsd-1-1-RELEASE netbsd-1-1-base
# 1.78 07-Oct-1995 mycroft

Prefix names of system call implementation functions with `sys_'.


# 1.77 22-Apr-1995 christos

- new copyargs routine.
- use emul_xxx
- deprecate nsysent; use constant SYS_MAXSYSCALL instead.
- deprecate ep_setup
- call sendsig and setregs indirectly.


# 1.76 25-Mar-1995 cgd

make it reasonable for processes to not double-map their user area and kstack


# 1.75 19-Mar-1995 mycroft

Nuke startinit_verbose.


# 1.74 18-Jan-1995 mycroft

Turn mountlist into a CIRCLEQ, and handle setting and checking of MNT_ROOTFS
differently.


# 1.73 12-Jan-1995 cgd

cast pointers to longs.


# 1.72 24-Dec-1994 cgd

make return type explicit, from James Jegers


# 1.71 19-Dec-1994 cgd

use ALIGNBYTES for calculating alignment. no reason not to, and good style
to do so.
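
For readers unfamiliar with the idiom: an ALIGNBYTES-style mask rounds an
address up to the next boundary, which is only correct when space is being
carved upward. When working down from a high address, the value has to be
rounded down instead. The sketch below uses hypothetical SK_* names and
assumes the alignment is sizeof(long); the real ALIGNBYTES is
machine-dependent.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for this sketch: align to sizeof(long). */
#define SK_ALIGNBYTES	(sizeof(long) - 1)

/* Round up to the next aligned address (what an ALIGN()-style macro does). */
static uintptr_t
align_up(uintptr_t p)
{
	/* The mask trick requires a power-of-two alignment. */
	assert(((SK_ALIGNBYTES + 1) & SK_ALIGNBYTES) == 0);
	return (p + SK_ALIGNBYTES) & ~(uintptr_t)SK_ALIGNBYTES;
}

/* Round down to the previous aligned address. */
static uintptr_t
align_down(uintptr_t p)
{
	assert(((SK_ALIGNBYTES + 1) & SK_ALIGNBYTES) == 0);
	return p & ~(uintptr_t)SK_ALIGNBYTES;
}
```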


# 1.70 03-Nov-1994 deraadt

you cannot ALIGN() backwards


# 1.69 28-Oct-1994 cgd

kill space.


# 1.68 20-Oct-1994 cgd

update for new syscall args description mechanism


# 1.67 18-Oct-1994 cgd

DEBUG and/or DIAGNOSTIC shouldn't cause things to be printed for "normal"
cases, unless the user explicitly requests it. add variable
startinit_verbose to control init-starting messages.


# 1.66 11-Oct-1994 mycroft

Avoid GCC generating a call to memset().


# 1.65 22-Sep-1994 mycroft

Maintain vfs reference counts.


# 1.64 10-Sep-1994 mycroft

Nuke the silly `--' hack when there are no flags.


# 1.63 30-Aug-1994 mycroft

Convert process, file, and namei lists and hash tables to use queue.h.


# 1.62 17-Jul-1994 cgd

fix RCS ID. *sigh*


Revision tags: netbsd-1-0-base
# 1.61 03-Jul-1994 mycroft

branches: 1.61.2;
No more HP copyright.


# 1.60 03-Jul-1994 cgd

kill a relic


# 1.59 30-Jun-1994 cgd

fix warning


# 1.58 30-Jun-1994 cgd

fix some lossage


# 1.57 29-Jun-1994 cgd

New RCS ID's, take two. they're more aesthetically pleasant, and use 'NetBSD'


# 1.56 08-Jun-1994 mycroft

Update to 4.4-Lite fs code.


# 1.55 03-Jun-1994 cgd

kill old init-starting code


# 1.54 31-May-1994 phil

pc532 now does new init process


# 1.53 29-May-1994 gwr

Now the sun3 starts init the new way.


# 1.52 27-May-1994 deraadt

ufs/ufs/quote.h? no.. not yet..


# 1.51 27-May-1994 mycroft

hp300 port is blessed.


# 1.50 27-May-1994 mycroft

The i386 port is now blessed.


# 1.49 27-May-1994 chopps

amiga now included in list of new init bootstrap users


# 1.48 27-May-1994 mycroft

Fix thinko in last change.


# 1.47 27-May-1994 mycroft

Get the arguments to vm_allocate() right in new init code.


# 1.46 27-May-1994 glass

pmax and sparc take the 4.4-lite path


# 1.45 21-May-1994 cgd

add latent support for new way of starting init


# 1.44 19-May-1994 cgd

kill a notdef


# 1.43 18-May-1994 cgd

mostly-machine-independent switch, and changes to match. also, hack init_main


# 1.42 16-May-1994 cgd

kill uname-related crap


# 1.41 13-May-1994 cgd

setrq -> setrunqueue, sched -> scheduler


# 1.40 05-May-1994 cgd

lots of changes: prototype migration, move lots of variables, definitions,
and structure elements around. kill some unnecessary type and macro
definitions. standardize clock handling. More changes than you'd want.


# 1.39 04-May-1994 cgd

Rename a lot of process flags.


# 1.38 18-Mar-1994 mycroft

Clean up uname(2) code some more.


# 1.37 13-Feb-1994 mycroft

KNFify uname code.


# 1.36 26-Jan-1994 mw

amiga wants RTC started early, too (like i386 and mac)


# 1.35 14-Jan-1994 deraadt

`extern int cpu' isn't used at all.


# 1.34 09-Jan-1994 briggs

Ugh. Missed the other. mac=>mac68k...


# 1.33 09-Jan-1994 briggs

mac => mac68k


# 1.32 08-Jan-1994 mycroft

#include vm_user.h.


# 1.31 18-Dec-1993 mycroft

Canonicalize all #includes.


# 1.30 23-Nov-1993 deraadt

initialize pseudo devices with pdevinit[], not with a bunch of
#include/#ifdef pairs..


# 1.29 14-Nov-1993 cgd

Add the System V message queue and semaphore facilities. Implemented
by Daniel Boulet <danny@BouletFermat.ab.ca>


# 1.28 15-Oct-1993 cgd

get rid of __main() -- it's going into libkern


Revision tags: magnum-base
# 1.27 15-Sep-1993 cgd

make allproc be volatile, and cast things accordingly.
suggested by torek, because CSRG had problems with reordering
of assignments to allproc leading to strange panics from kernels
compiled with gcc2...


# 1.26 12-Sep-1993 glass

check return codes on copyout()s, panic if they fail.


# 1.25 31-Aug-1993 deraadt

branches: 1.25.2;
fixed a little /lib/cpp boo-boo


# 1.24 29-Aug-1993 cgd

print more DIAGNOSTIC info, and startrtclock early on the mac (like i386)


# 1.23 23-Aug-1993 mycroft

RLIMIT_OFILE --> RLIMIT_NOFILE


# 1.22 14-Aug-1993 deraadt

ppp from paul mackerras


# 1.21 07-Aug-1993 cgd

do the Net/2 thing with startrtclock() for non-i386 architectures.
i386's startrtclock should be moved down, as well, but i believe it
does some magic...


# 1.20 28-Jul-1993 cgd

incorporate changes from 0-9-base to 0-9-ALPHA


Revision tags: netbsd-0-9-base
# 1.19 18-Jul-1993 andrew

branches: 1.19.2;
* don't use copyout() to relocate icode - use bcopy() instead


# 1.18 10-Jul-1993 cgd

handle the initflags problem in a simple (if twisted) way.
also, remind the pagedaemon that it's a daemon, not an r... 8-)


# 1.17 10-Jul-1993 mycroft

Change the names of processes 0 and 2.


# 1.16 27-Jun-1993 andrew

ANSIfications - removed all implicit function return types and argument
definitions. Ensured that all files include "systm.h" to gain access to
general prototypes. Casts where necessary.


# 1.15 21-Jun-1993 deraadt

> NetBSD 0.8a (TDR) #2: Mon Jun 21 11:06:03 MDT 1993
produces "uname -v" output "TDR#2"
"uname -a" then is..
> NetBSD gecko 0.8a TDR#2 i386


# 1.14 18-Jun-1993 brezak

Find version number for uname.


# 1.13 20-May-1993 cgd

hack on the uname "machine name" stuff for hopefully the last time.
now it uses MACHINE, as defined in param.h


# 1.12 20-May-1993 cgd

add $Id$ strings, and clean up file headers where necessary


# 1.11 20-May-1993 cgd

make uname stuff in init_main machine independent


# 1.10 07-May-1993 cgd

fix uname initialization


# 1.9 06-May-1993 cgd

diffs for uname (posix!) system call, provided by John Brezak <brezak@osf.org>


# 1.8 28-Apr-1993 mycroft

Give processes 0 and 2 more appropriate names (`scheduler' and `swapper', respectively).


Revision tags: netbsd-0-8 netbsd-alpha-1
# 1.7 10-Apr-1993 cgd

version's not supposed to be printed here; it's supposed to be printed
in machdep.c


# 1.6 06-Apr-1993 cgd

changed order of copyright/version notice (to match 4.4 boot string)...


# 1.5 03-Apr-1993 cgd

got rid of accidental extra newline


# 1.4 03-Apr-1993 cgd

added changes from Steven Reiz <sreiz@aie.nl> (based on
those by Poul-Henning Kamp <phk@data.fls.dk>) to get the kernel
to compile properly when gcc2.* is cc. (should still work
when gcc1.39 is in use.)


# 1.3 03-Apr-1993 cgd

now just prints out version. also, got rid of kernel_version,
and fixed wfj's trampling on UCB copyright notices.


# 1.2 23-Mar-1993 cgd

got rid of highlighted test, and changed copyright/kernel version
string declarations


# 1.1 21-Mar-1993 cgd

branches: 1.1.1;
Initial revision