History log of /freebsd-10.0-release/sys/kern/kern_timeout.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 259065 07-Dec-2013 gjb

- Copy stable/10 (r259064) to releng/10.0 as part of the
10.0-RELEASE cycle.
- Update __FreeBSD_version [1]
- Set branch name to -RC1

[1] 10.0-CURRENT __FreeBSD_version value ended at '55', so
start releng/10.0 at '100' so the branch is started with
a value ending in zero.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation

# 256281 10-Oct-2013 gjb

Copy head (r256279) to stable/10 as part of the 10.0-RELEASE cycle.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


# 255877 26-Sep-2013 davide

Make the callout arithmetic more robust adding checks for overflow.
Without these, if the timeout value passed is "large enough", the
value of the sum of it and other factors (e.g. current time as
returned by sbinuptime() or 'precision' argument) might result in a
negative number. This negative number is then passed to
eventtimers(4), which causes et_start() routine to load et_min_period
into eventtimer, making the CPU where the thread is stuck forever in
timer interrupt handler routine. This is now avoided rounding to
INT64_MAX the timeout period in case of overflow.

Reported by: kib, pho
Discussed with: kib, mav
Tested by: pho (stress2 suite, kevent7.sh scenario)
Approved by: re (kib)


# 255747 20-Sep-2013 davide

Fix callout_init_rm() in the shared case, allocating storage for 'struct
rm_priotracker' directly in the softclock thread. Now consumers can
pass CALLOUT_SHAREDLOCK flag to callout initialization routine safely.
The choice of the already existing flags instead of special casing
shared rmlocks is done to prevent consumer footshooting.

Suggested by: jhb
Reviewed by: jhb
Approved by: re (delphij)


# 254350 15-Aug-2013 markj

Specify SDT probe argument types in the probe definition itself rather than
using SDT_PROBE_ARGTYPE(). This will make it easy to extend the SDT(9) API
to allow probes with dynamically-translated types.

There is no functional change.

MFC after: 2 weeks


# 248699 25-Mar-2013 davide

Cache the callout precision argument as part of the informations required
for migrating callouts to new CPU. This value is passed to
callout_cc_add() in order to update properly precision field in case of
rescheduling/migration.

Reviewed by: mav


# 248141 10-Mar-2013 andre

Bring back the comment on the sizing of the callout array that got
lost in r248031.

Requested by: alc, alfred


# 248113 09-Mar-2013 davide

Fixup r248032:
Change size requested to malloc(9) now that callwheel buckets are
callout_list and not callout_tailq anymore. This change was already
there but it seems it got lost after code churn in r248032.

Reported by: alc, kib


# 248032 08-Mar-2013 andre

Move the callout subsystem initialization to its own SYSINIT()
from being indirectly called via cpu_startup()+vm_ksubmap_init().
The boot order position remains the same at SI_SUB_CPU.

Allocation of the callout array is changed to stardard kernel malloc
from a slightly obscure direct kernel_map allocation.

kern_timeout_callwheel_alloc() is renamed to callout_callwheel_init()
to better describe its purpose.
kern_timeout_callwheel_init() is removed simplifying the per-cpu
initialization.

Reviewed by: davide


# 248031 08-Mar-2013 andre

Move the auto-sizing of the callout array from init_param2() to
kern_timeout_callwheel_alloc() where it is actually used.

This is a mechanical move and no tuning parameters are changed.

The pre-allocated callout array is only used for legacy timeout(9)
calls and is only allocated and active on cpu0. Eventually all
remaining users of timeout(9) should switch to the callout_* API.

Reviewed by: davide


# 247818 04-Mar-2013 davide

Complete r247813:
Use true/false instead of TRUE/FALSE.

Reported by: attilio
Requested by: jhb


# 247813 04-Mar-2013 davide

Use C99 'bool' rather than Machish 'boolean_t'.

Requested by: jhb


# 247793 04-Mar-2013 davide

Fix build with DIAGNOSTIC/CALLOUT_PROFILING options turned on.

Reported by: kib, David Wolfskill <david at catwhisker dot org>
Pointy-hat to: davide


# 247777 04-Mar-2013 davide

- Make callout(9) tickless, relying on eventtimers(4) as backend for
precise time event generation. This greatly improves granularity of
callouts which are not anymore constrained to wait next tick to be
scheduled.
- Extend the callout KPI introducing a set of callout_reset_sbt* functions,
which take a sbintime_t as timeout argument. The new KPI also offers a
way for consumers to specify precision tolerance they allow, so that
callout can coalesce events and reduce number of interrupts as well as
potentially avoid scheduling a SWI thread.
- Introduce support for dispatching callouts directly from hardware
interrupt context, specifying an additional flag. This feature should be
used carefully, as long as interrupt context has some limitations
(e.g. no sleeping locks can be held).
- Enhance mechanisms to gather informations about callwheel, introducing
a new sysctl to obtain stats.

This change breaks the KBI. struct callout fields has been changed, in
particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t'
(8 bytes) and another 'sbintime_t' field was added for precision.

Together with: mav
Reviewed by: attilio, bde, luigi, phk
Sponsored by: Google Summer of Code 2012, iXsystems inc.
Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm),
markj (amd64), mav, Fabian Keil


# 247715 03-Mar-2013 davide

callwheelmask and callwheelsize are always greater than zero.
Switch their type to u_int.


# 247714 03-Mar-2013 davide

Remove a couple of unused include.


# 247698 03-Mar-2013 mav

MFcalloutng:
Some whitespace fixes.


# 247467 28-Feb-2013 davide

MFcalloutng:
Style fixes.


# 243912 05-Dec-2012 attilio

Fixup r243901:
- As the comment report, CALLOUT_LOCAL_ALLOC cannot be checked
directly from the callout flags but might be checked by a cached
value. Hence, do so before to actually remove the callout, when
needed, in softclock_call_cc().
- In softclock_call_cc() also add a comment in the waiting and deferred
migration case explaining that the dereference should be safe
because of the migration dereference invariants.

Additively:
- In softclock_call_cc(), for the deferred migration case, move all the
accesses to callout structure after the comment stating the callout
must not be destroyed.
- For consistency with this last tweak, use cached c_flags for the
KASSERT() in the deferred migration case. It is not strictly necessary
but this way all the callout accesses happen after the above mentioned
comment, improving consistency.

Pointy hat to: me
Sponsored by: Isilon Systems / EMC Corporation
Reviewed by: kib
MFC after: 2 weeks
X-MFC: 243901


# 243901 05-Dec-2012 kib

The softclock_call_cc() is executing with the callout already removed
from the callwheel. Calculate the cc->cc_next before removing the
callout, otherwise the code followed the invalid tailq links. After
this, make softclock_call_cc() return void, since it always return
cc->cc_next, which is immediately available to the softclock()
anyway. This also allows to eliminate a label under #ifdef SMP.

Remove the assignment of cc->cc_next from callout_cc_del(), since the
function is called with the callout already removed from callwheel.

If cancelling the migration, also clear the CALLOUT_DFRMIGRATION flag.

Postpone the free of the timeout(9) allocated callouts after the
migration checks are done.

Add some more strict asserts about the state of the callout in
callout_call_cc().

Reviewed by: attilio
Reported and tested by: pho (previous version)
MFC after: 2 weeks


# 243853 04-Dec-2012 alfred

replace bit shifting loop with 1<<fls(n), improve comments.

Reviewed by: davide


# 242402 31-Oct-2012 attilio

Rework the known mutexes to benefit about staying on their own
cache line in order to avoid manual frobbing but using
struct mtx_padalign.

The sole exception being nvme and sxfge drivers, where the author
redefined CACHE_LINE_SIZE manually, so they need to be analyzed and
dealt with separately.

Reviwed by: jimharris, alc


# 242401 31-Oct-2012 jimharris

Pad and align the callout_cpu mtx to its own cacheline to reduce false
sharing especially on the default CPU 0 callout_cpu structure.

This will be followed up by attilio@ with a conversion to the new struct
mtx_padalign but doing this manual conversion first gives an easy MFC
candidate since mtx_padalign is a more extensive system change.

Sponsored by: Intel
Reviewed by: jeff, attilio
MFC after: 1 week


# 234981 03-May-2012 kib

Move the code to call the callout callback into the helper function
softclock_call_cc(). While there, move some common code to callout_cc_del().

Requested by: avg, jhb
Reviewed by: jhb
MFC after: 1 week


# 234952 03-May-2012 kib

When callout_reset_on() cannot immediately migrate a callout since it
is running on other cpu, the CALLOUT_PENDING flag is temporarily
cleared. Then, callout_stop() on this, in fact active, callout fails
because CALLOUT_PENDING is not set, and callout_stop() returns 0.

Now, in sleepq_check_timeout(), the failed callout_stop() causes the
sleepq code to execute mi_switch() without even setting the wmesg,
since the switch-out is supposed to be transient. In fact, the thread
is put off the CPU for full timeout interval, instead of being put on
runq immediately. Until timeout fires, the process is unkillable for
obvious reasons.

Fix this by marking the migrating callouts with CALLOUT_DFRMIGRATION
flag. The flag is cleared by callout_stop_safe() when the function
detects a migration, besides returning the success. The softclock()
rechecks the flag for migrating callout and cancels its execution if
the flag was cleared meantime.

PR: misc/166340
Reported, debugging traces provided and tested by:
Christian Esken <christian.esken trivago com>
Reviewed by: avg, jhb
MFC after: 1 week


# 227293 07-Nov-2011 ed

Mark MALLOC_DEFINEs static that have no corresponding MALLOC_DECLAREs.

This means that their use is restricted to a single C file.


# 225057 21-Aug-2011 attilio

callout_cpu_switch() allows preemption when dropping the outcoming
callout cpu lock (and after having dropped it).
If the newly scheduled thread wants to acquire the old queue it will
just spin forever.

Fix this by disabling preemption and interrupts entirely (because fast
interrupt handlers may incur in the same problem too) while switching
locks.

Reported by: hrs, Mike Tancsa <mike AT sentex DOT net>,
Chip Camden <sterling AT camdensoftware DOT com>
Tested by: hrs, Mike Tancsa <mike AT sentex DOT net>,
Chip Camden <sterling AT camdensoftware DOT com>,
Nicholas Esborn <nick AT desert DOT net>
Approved by: re (kib)
MFC after: 10 days


# 220456 08-Apr-2011 attilio

Reintroduce the fix already discussed in r216805 (please check its history
for a detailed explanation of the problems).

The only difference with the previous fix is in Solution2:
CPUBLOCK is no longer set when exiting from callout_reset_*() functions,
which avoid the deadlock (leading to r217161).
There is no need to CPUBLOCK there because the running-and-migrating
assumption is strong enough to avoid problems there.
Furthermore add a better !SMP compliancy (leading to shrinked code and
structures) and facility macros/functions.

Tested by: gianni, pho, dim
MFC after: 3 weeks


# 217161 08-Jan-2011 attilio

Revert r216805.
That revision is introducing a bug which is more visible than problems
it is trying to fix.

As long as my time is very limited in this period I am going to
commit back this patch just once it is fully fixed.

Reported by: dim, Nicholas Esborn


# 216805 29-Dec-2010 attilio

Fix several callout migration races:
- Problem1:
Hypothesis: thread1 is doing a callout_reset_on(), within his
callout handler, willing to implicitly or explicitly migrate the
callout. thread2 is draining the callout.

Thesys:
* thread1 calls callout_lock() and locks the old callout cpu
* thread1 performs the checks in the first path of the
callout_reset_on()
* thread1 hits this codepiece:
/*
* If the lock must migrate we have to check the state again as
* we can't hold both the new and old locks simultaneously.
*/
if (c->c_cpu != cpu) {
c->c_cpu = cpu;
CC_UNLOCK(cc);
goto retry;
}

which means it will drop the lock and 'retry'
* thread2 will callout_lock() and locks the new callout cpu.
thread1 spins on the new lock and will not keep going for the
moment.
* thread2 checks that the callout is not pending (as callout is
currently running) and that it is not on cc->cc_curr (because cc
now refers to the new callout and the callout is running on the
old callout cpu) thus it thinks it is done and returns.
* thread1 will now acquire the lock and then adds the callout
to the new callout cpu queue

That seems an obvious race as callout_stop() falsely reports
the callout stopped or worse, callout_drain() falsely returns
while the callout is still in use.
- Solution1:
Fixing this problem would require, in general, to lock both
callout cpus at once while switching the c_cpu field and avoid
cyclic deadlocks between callout cpus locks.
The concept of CPUBLOCK is then introduced (working more or less
like the blocked_lock for thread_lock() function) meaning:
"in callout_lock(), spin until the c->c_cpu is not different from
CPUBLOCK". That way the "original" callout cpu, referred to the
above mentioned code snippet, will remain blocked until the lock
handover is over critical path will remain covered.

- Problem2:
Having the callout currently executed on a specific callout cpu
and contemporary pending on another callout cpu (as it can happen
with current code) breaks, at least, the assumption callout_drain()
returns just once the callout cannot be referenced anymore.
- Solution2:
Callout migration is deferred if the current callout is already
under execution.
The best place to do that is in softclock() and new members are
added to the callout cpu structure in order to specify a pending
migration is requested. That is necessary because the callout
cannot be trusted (not freed) the 100% of times after the execution
of the callout handler.
CPUBLOCK will prevent, in the "deferred migration" case, that the
callout gets freed in this case, stopping any callout_stop() and
callout_drain() possible activity until the migration is
actually performed.

- Problem3:
There is a further race in callout_drain().
In order to avoid a race between sleepqueue lock and callout cpu
spinlock, in _callout_stop_safe(), the callout cpu lock is dropped,
the sleepqueue lock is acquired and a new callout cpu lookup is
performed. Note that the channel used for locking the sleepqueue is
obtained from the "current" callout cpu (&cc->cc_waiting).
If the callout migrated in the meanwhile, callout_drain() will end up
using the wrong wchan for the sleepqueue (the locked one will be the
older, while the new one will not really be locked) leading to a
lock leak and a race access to sleepqueue.
- Solution3:
It is enough to check if a migration happened between the operation
of acquiring the sleepqueue lock and the new callout cpu lock and
eventually unwind all those and try again.

This problems can lead to deathly races on moderate (4-ways) SMP
environment, leading to easy panic or deadlocks.
The 24-ways of the reporter, could easilly panic, with completely
normal workload, almost daily.
gianni@ kindly wrote the following prof-of-concept which can
panic a FreeBSD machine in less than one hour, in smaller SMP:
http://www.freebsd.org/~attilio/callout/test.c

Reported by: Nicholas Esborn <nick at desert dot net>, DesertNet
In collabouration with: gianni, pho, Nicholas Esborn
Reviewed by: jhb
MFC after: 1 week (*)

* Usually, I would aim for a larger MFC timeout, but I really want this
in before 8.2-RELEASE, thus re@ accepted a shorter timeout as a special
case for this patch


# 214746 03-Nov-2010 jhb

Remove 'softclock_ih' as it is no longer used.


# 214597 31-Oct-2010 mav

Fix callout_tickstofirst() behavior after signed integer ticks overflow.
This should fix callout precision drop to 1/4s after 25 days of uptime
with HZ = 1000.

Submitted by: Taku YAMAMOTO <taku@tackymt.homeip.net>


# 212604 14-Sep-2010 mav

Fix panic on NULL dereference possible after r212541.


# 212603 14-Sep-2010 mav

Make kern_tc.c provide minimum frequency of tc_ticktock() calls, required
to handle current timecounter wraps. Make kern_clocksource.c to honor that
requirement, scheduling sleeps on first CPU for no more then specified
period. Allow other CPUs to sleep up to 1/4 second (for any case).


# 212541 13-Sep-2010 mav

Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When CPU is busy interrupts are generating at full rate
of hz + stathz to fullfill scheduler and timekeeping requirements. But
when CPU is idle, only minimum set of interrupts (down to 8 interrupts per
second per CPU now), needed to handle scheduled callouts is executed.
This allows significantly increase idle CPU sleep time, increasing effect
of static power-saving technologies. Also it should reduce host CPU load
on virtualized systems, when guest system is idle.

There is set of tunables, also available as writable sysctls, allowing to
control wanted event timer subsystem behavior:
kern.eventtimer.timer - allows to choose event timer hardware to use.
On x86 there is up to 4 different kinds of timers. Depending on whether
chosen timer is per-CPU, behavior of other options slightly differs.
kern.eventtimer.periodic - allows to choose periodic and one-shot
operation mode. In periodic mode, current timer hardware taken as the only
source of time for time events. This mode is quite alike to previous kernel
behavior. One-shot mode instead uses currently selected time counter
hardware to schedule all needed events one by one and program timer to
generate interrupt exactly in specified time. Default value depends of
chosen timer capabilities, but one-shot mode is preferred, until other is
forced by user or hardware.
kern.eventtimer.singlemul - in periodic mode specifies how much times
higher timer frequency should be, to not strictly alias hardclock() and
statclock() events. Default values are 2 and 4, but could be reduced to 1
if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU to receive every timer interrupt
independently of whether they busy or not. By default this options is
disabled. If chosen timer is per-CPU and runs in periodic mode, this option
has no effect - all interrupts are generating.

As soon as this patch modifies cpu_idle() on some platforms, I have also
refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions
(if supported) under high sleep/wakeup rate, as fast alternative to other
methods. It allows SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by: many (on i386, amd64, sparc64 and powerc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.


# 211616 22-Aug-2010 rpaulo

Add an extra comment to the SDT probes definition. This allows us to get
use '-' in probe names, matching the probe names in Solaris.[1]

Add userland SDT probes definitions to sys/sdt.h.

Sponsored by: The FreeBSD Foundation
Discussed with: rwaston [1]


# 209059 11-Jun-2010 jhb

Update several places that iterate over CPUs to use CPU_FOREACH().


# 200510 14-Dec-2009 luigi

Properly fix callout handling by putting all the per-cpu info in
struct callout_cpu. From the comment in the file:

+ * There is one struct callout_cpu per cpu, holding all relevant
+ * state for the callout processing thread on the individual CPU.
+ * In particular:
+ * cc_ticks is incremented once per tick in callout_cpu().
+ * It tracks the global 'ticks' but in a way that the individual
+ * threads should not worry about races in the order in which
+ * hardclock() and hardclock_cpu() run on the various CPUs.
+ * cc_softclock is advanced in callout_cpu() to point to the
+ * first entry in cc_callwheel that may need handling. In turn,
+ * a softclock() is scheduled so it can serve the various entries i
+ * such that cc_softclock <= i <= cc_ticks .

Together with a smaller patch committed in september, this fixes a
bug that affects 8.0 with apps that rely on callouts to fire exactly
in the number of ticks specified (qemu among them).
Right now, callouts in 8.0 fire one tick late.

This was discussed in september with JeffR and jhb

MFC after: 3 days


# 197137 12-Sep-2009 luigi

Make sure callouts are not processed one tick late.
The problem was introduced in SVN 180608/ rev 1.114 and affects
all users of callout_reset() (including select, usleep, setitimer).
A better fix probably involves replicating 'ticks' in the
struct callout_cpu; this commit is just a temporary thing so that
we can MFC it after a suitable test time and RE approval.

MFC after: 3 days


# 187664 24-Jan-2009 rwatson

Add explicit static DTrace tracing to the callout mechanism, capturing
pointers to the callout handler just before and just after the callout
it invoked. I attempted to do this in a manner congruent to tracing in
Solaris's callout mechanism, but couldn't quite use the same names due
to convention and syntax differences.

Example DTrace script to generate a distribution graph of callout
execution times:

callout_execute:::callout_start
{
self->cstart = timestamp;
}

callout_execute:::callout_end
{

@length = quantize(timestamp - self->cstart);
}

Reviewed by: jb
MFC after: 3 days


# 187150 13-Jan-2009 jhb

Add a new KTR tracepoint in the KTR_CALLOUT class to note when a callout
routine finishes executing.

MFC after: 1 week


# 184385 28-Oct-2008 peter

After a machine has been up for a bit more than 20 days with HZ=1000,
"ticks" goes negative. This breaks the signed comparison in softclock.
This causes sleep() to never wake up, tcp to stop, etc etc. This is
bad(TM). Use the SEQ_LT() method from tcp's sequence number comparisons.


# 181191 02-Aug-2008 sam

add callout_schedule; besides being useful it also improves
compatibility with other systems

Reviewed by: ed, battlez


# 180608 19-Jul-2008 jeff

Fix a race which could result in some timeout buckets being skipped.
- When a tick occurs on a cpu, iterate from cs_softticks until ticks.
The per-cpu tick processing happens asynchronously with the actual
adjustment of the 'ticks' variable. Sometimes the results may
be visible before the local call and sometimes after. Previously this
could cause a one tick window where we didn't evaluate the bucket.
- In softclock fetch curticks before incrementing cc_softticks so we
don't skip insertions which were made for the current time.

Sponsored by: Nokia


# 177949 06-Apr-2008 jeff

- Correct a major error introduced in the per-cpu timeout commit. Sleep
and wakeup require the same wait channel to function properly.

Found by: kris
Pointy hat: me


# 177859 02-Apr-2008 jeff

Implement per-cpu callout threads, wheels, and locks.

- Move callout thread creation from kern_intr.c to kern_timeout.c
- Call callout_tick() on every processor via hardclock_cpu() rather than
inspecting callout internal details in kern_clock.c.
- Remove callout implementation details from callout.h
- Package up all of the global variables into a per-cpu callout structure.
- Start one thread per-cpu. Threads are not strictly bound. They prefer
to execute on the native cpu but may migrate temporarily if interrupts
are starving callout processing.
- Run all callouts by default in the thread for cpu0 to maintain current
ordering and concurrency guarantees. Many consumers may not properly
handle concurrent execution.
- The new callout_reset_on() api allows specifying a particular cpu to
execute the callout on. This may migrate a callout to a new cpu.
callout_reset() schedules on the last assigned cpu while
callout_reset_curcpu() schedules on the current cpu.

Reviewed by: phk
Sponsored by: Nokia


# 177491 22-Mar-2008 alfred

Fix a race where timeout/untimeout could cause crashes for Giant locked
code.

The bug:

There exists a race condition for timeout/untimeout(9) due to the
way that the softclock thread dequeues timeouts.

The softclock thread sets the c_func and c_arg of the callout to
NULL while holding the callout lock but not Giant. It then drops
the callout lock and acquires Giant.

It is at this point where untimeout(9) on another cpu/thread could
be called.

Since c_arg and c_func are cleared, untimeout(9) does not touch the
callout and returns as if the callout is canceled.

The softclock then tries to acquire Giant and likely blocks due to
the other cpu/thread holding it.

The other cpu/thread then likely deallocates the backing store that
c_arg points to and finishes working and hence drops Giant.

Softclock resumes and acquires giant and calls the function with
the now free'd c_arg and we have corruption/crash.

The fix:

We need to track curr_callout even for timeout(9) (LOCAL_ALLOC)
callouts. We need to free the callout after the softclock processes
it to deal with the race here.

Obtained from: Juniper Networks, iedowse
Reviewed by: jhb, iedowse
MFC After: 2 weeks.


# 177085 12-Mar-2008 jeff

- Pass the priority argument from *sleep() into sleepq and down into
sched_sleep(). This removes extra thread_lock() acquisition and
allows the scheduler to decide what to do with the static boost.
- Change the priority arguments to cv_* to match sleepq/msleep/etc.
where 0 means no priority change. Catch -1 in cv_broadcastpri() and
convert it to 0 for now.
- Set a flag when sleeping in a way that is compatible with swapping
since direct priority comparisons are meaningless now.
- Add a sysctl to ule, kern.sched.static_boost, that defaults to on which
controls the boost behavior. Turning it off gives better performance
in some workloads but needs more investigation.
- While we're modifying sleepq, change signal and broadcast to both
return with the lock held as the lock was held on enter.

Reviewed by: jhb, peter


# 176013 05-Feb-2008 attilio

Really, no explicit checks against against lock_class_* object should be
done in consumers code: using locks properties is much more appropriate.
Fix current code doing these bogus checks.

Note: Really, callout are not usable by all !(LC_SPINLOCK | LC_SLEEPABLE)
primitives like rmlocks doesn't implement the generic lock layer
functions, but they can be equipped for this, so the check is still
valid.

Tested by: matteo, kris (earlier version)
Reviewed by: jhb


# 173842 22-Nov-2007 attilio

Cache the value of c_lock as it can change, in the struct,
while the global callout spinlock is not held, and can lead to PF#.

Reported by: dougb, Mark Atkinson <atkin901 at yahoo dot com>
Tested by: dougb
Diagnosed by: jhb


# 173760 19-Nov-2007 attilio

Add the function callout_init_rw() to callout facility in order to use
rwlocks in conjuction with callouts. The function does basically what
callout_init_mtx() alredy does with the difference of using a rwlock
as extra argument.
CALLOUT_SHAREDLOCK flag can be used, now, in order to acquire the lock only
in read mode when running the callout handler. It has no effects when used
in conjuction with mtx.

In order to implement this, underlying callout functions have been made
completely lock type-unaware, so accordingly with this, sysctl
debug.to_avg_mtxcalls is now changed in the generic
debug.to_avg_lockcalls.

Note: currently the allowed lock classes are mutexes and rwlocks because
callout handlers run in softclock swi, so they cannot sleep and they
cannot acquire sleepable locks like sx or lockmgr.

Requested by: kmacy, pjd, rwatson
Reviewed by: jhb


# 172184 15-Sep-2007 rwatson

Remove the definition and implementation of 'CALLOUT_NETGIANT', a now- (and
possibly always-) unused define.

Reported by: kmacy
Approved by: re (kensmith)


# 172025 31-Aug-2007 jhb

Close a race that snuck in with the recent changes to fix a LOR between
the callout_lock spin lock and the sleepqueue spin locks. In the fix,
callout_drain() has to drop the callout_lock so it can acquire the
sleepqueue lock. The state of the callout can change while the
callout_lock is held however (for example, it can be rescheduled via
callout_reset()). The previous code assumed that the only state change
that could happen is that the callout could finish executing. This change
alters callout_drain() to effectively restart and recheck everything
after it acquires the sleepqueue lock thus handling all the possible
states that the callout could be in after any changes while callout_lock
was dropped.

Approved by: re (kensmith)
Tested by: kris


# 171053 26-Jun-2007 attilio

Fix an old standing LOR between callout_lock and sleepqueues chain (which
could lead to a deadlock).
- sleepq_set_timeout acquires callout_lock (via callout_reset()) only
with sleepq chain lock held
- msleep_spin in _callout_stop_safe lock the sleepqueue chain with
callout_lock held

In order to solve this don't use msleep_spin in _callout_stop_safe() but
use directly sleepqueues as inline msleep_spin code. Rearrange the
wakeup path in order to have it consistent too.

Reported by: kris (via stress2 test suite)
Tested by: Timothy Redaelli <drizzt@gufi.org>
Reviewed by: jhb
Approved by: jeff (mentor)
Approved by: re


# 169480 11-May-2007 andre

Make the TCP timer callout obtain Giant if the network stack is marked
as non-mpsafe.

This change is to be removed when all protocols are mp-safe.


# 163246 11-Oct-2006 glebius

Improve ktr(4) logging for callout(9) subsystem. Log all inserts and
removals, including failures, into the callwheel.

XXX: Most of the CTR() macros are called with callout_lock spin mutex
held, thus won't be logged into file, if KTR_ALQ is used. Moving the
CTR() macros out from the spinlocked code would require copying of all
arguments. I'm too lazy to do this.


# 155957 23-Feb-2006 jhb

Use the recently added msleep_spin() function to simplify the
callout_drain() logic. We no longer need a separate non-spin mutex to
do sleep/wakeup with, instead we can now just use the one spin mutex to
manage all the callout functionality.


# 150188 15-Sep-2005 jhb

Oops, missed adding the required include.

Pointy hat to: jhb


# 150187 15-Sep-2005 jhb

Replace the dont_sleep_in_callout mutex hack (similar to g_x{up,down})
with the disallow sleeping facility.


# 149879 08-Sep-2005 glebius

Make callout_reset() return a non-zero value if a pending callout
was rescheduled. If there was no pending callout, then return 0.

Reviewed by: iedowse, cperciva


# 141674 10-Feb-2005 iedowse

When processing a timeout() callout and returning it to the free
list, set `curr_callout' to NULL. This ensures that we won't attempt
to cancel the current callout if the original callout structure
gets recycled while we wait to acquire Giant.

This is reported to fix an intermittent syscons problem that was
introduced by revision 1.96.


# 141428 07-Feb-2005 iedowse

Add a mechanism for associating a mutex with a callout when the
callout is first initialised, using a new function callout_init_mtx().
The callout system will acquire this mutex before calling the callout
function and release it on return.

In addition, the callout system uses the mutex to avoid most of the
complications and race conditions inherent in asynchronous timer
facilities, so mutex-protected callouts have much simpler semantics.
As long as the mutex is held when invoking callout_stop() or
callout_reset(), then these functions will guarantee that the callout
will be stopped, even if softclock() had already begun to process
the callout.

Existing Giant-locked callouts will automatically pick up the new
race-free semantics. This should close a number of race conditions
in the USB code and probably other areas of the kernel too.

There should be no change in behaviour for "MP-safe" callouts; these
still need to use the techniques mentioned in timeout(9) to avoid
race conditions.


# 140492 19-Jan-2005 cperciva

Make "c->c_func = NULL" conditional on CALLOUT_LOCAL_ALLOC in both
places where it occurs, not just one. :-)

Pointed out by: glebius
Pointy had to: cperciva


# 140489 19-Jan-2005 cperciva

Make "c->c_func = NULL" conditional on the CALLOUT_LOCAL_ALLOC flag,
i.e., only clear c->c_func if the callout c is being used via the old
timeout(9) interface.

Requested by: glebius


# 140487 19-Jan-2005 cperciva

Clarify the description of the callout_active() macro: It is cleared by
callout_stop, callout_drain, and callout_deactivate, but is not
automatically cleared when a callout returns.


# 139831 07-Jan-2005 cperciva

Adjust two of my comments to the new world order: Indent protection in
the first column is performed using /**, not /*-.


# 133229 06-Aug-2004 rwatson

Cut a KTR record whenever a callout is invoked. Mark whether it runs
with Giant or not, and include the function point so it can be looked
up against the kernel symbol table during trace analysis.


# 133190 06-Aug-2004 cperciva

When reseting a pending callout, perform the deregistration in
callout_reset rather than calling callout_stop. This results in a few
lines of code duplication, but it provides a significant performance
improvement because it avoids recursing on callout_lock.

Requested by: rwatson


# 128630 25-Apr-2004 hmp

The paper "Hashed Timers and Hierarchical Wheels: Data Structures for the
Efficient Implementation of a Timer Facility" was co-author'ed by T. Lauk,
not A. Lauk.

Adjust nearby whitespace.


# 128485 20-Apr-2004 cperciva

1. Remove callout_stop binary compatibility.
2. Document that this means that kernel modules must be rebuilt.
3. While I'm here, fix my sorting error in callout.h

Requested by: many [1], scottl [2], bde [3]


# 128024 08-Apr-2004 cperciva

Add whitespace before comment blocks. (reported by njl)
Remove spurious whitespace, add indent protection, fix punctuation,
remove initialization of static variables to zero, put wakeup_ctr
and wakeup_needed in the correct order. (reported by bde)

This doesn't fix all the style bugs I introduced, but the remaining
style bugs make it easier for me to understand what I did here.


# 127969 06-Apr-2004 cperciva

Introduce a callout_drain() function. This acts in the same manner as
callout_stop(), except that if the callout being stopped is currently
in progress, it blocks attempts to reset the callout and waits until the
callout is completed before it returns.

This makes it possible to clean up callout-using code safely, e.g.,
without potentially freeing memory which is still being used by a callout.

Reviewed by: mux, gallatin, rwatson, jhb


# 127911 05-Apr-2004 imp

Remove advertising clause from University of California Regent's license,
per letter dated July 22, 1999.

Approved by: core


# 123254 07-Dec-2003 phk

Make the DIAGNOSTIC code which complains about long {call|time}out(9)
functions less noisy: We printf if a new function took longer than
the previous record holder, or of the previous record holder took
more than twice as long as the current record.


# 122761 15-Nov-2003 phk

Rename the debugging mutex "callout_no_sleep" to "dont_sleep_in_callout".


# 122585 12-Nov-2003 mckusick

At the request of several developers, restore the DIAGNOSIC code
deleted in 1.81. Increase the initial timeout limit to 2ms to
eliminate spurious messages of excessive timeouts in the NFS
client code.

Requested by: Poul-Henning Kamp <phk@phk.freebsd.dk>
Requested by: Mike Silbersack <silby@silby.com>
Requested by: Sam Leffler <sam@errno.com>


# 122039 04-Nov-2003 mckusick

Get rid of DIAGNOSTIC that gives false positives on slow CPUs.


# 119358 23-Aug-2003 marcel

On ia64 time_t is 64 bit. Explicitly cast tv_sec to long and change
the corresponding format specifier to %ld in a call to printf() in
function softclock(). The printf() is conditional upon DIAGNOSTIC.

Found by: LINT


# 116606 20-Jun-2003 phk

Don't put callout_lock under #ifdef DIAGNOSTIC despite the fact that it
works anyway.


# 116602 20-Jun-2003 phk

Crude but efficient:

#ifdef DIAGNOSTIC hold a mutex while calling callout's so that we hear
about it if they sleep.


# 116182 10-Jun-2003 obrien

Use __FBSDID().


# 115810 04-Jun-2003 phk

Add instrumentation which tells us how much work softclock() does
per invocation.


# 110185 01-Feb-2003 phk

Under DIAGNOSTIC, only report expensive timeouts if they are more expensive
than the last on we reported.


# 102961 05-Sep-2002 phk

Fix a format buglet.

Spotted by: iedowse


# 102936 04-Sep-2002 phk

Under DIAGNOSTIC, complain if a timeout(9) routine took more than 1msec.


# 93818 04-Apr-2002 jhb

Change callers of mtx_init() to pass in an appropriate lock type name. In
most cases NULL is passed, but in some cases such as network driver locks
(which use the MTX_NETWORK_LOCK macro) and UMA zone locks, a name is used.

Tested on: i386, alpha, sparc64


# 92723 19-Mar-2002 alfred

Remove __P.


# 82127 22-Aug-2001 dillon

Move most of the kernel submap initialization code, including the
timeout callwheel and buffer cache, out of the platform specific areas
and into the machine independant area. i386 and alpha adjusted here.
Other cpus can be fixed piecemeal.

Reviewed by: freebsd-smp, jake


# 81481 10-Aug-2001 jhb

Change callout_stop() to return an integer. If callout_stop() succeeds in
removing the callout entry, return 1. If callout_stop() fails to remove
the callout entry because it is currently executing or has already been
executed, then the function returns 0. The idea was obtained from BSD/OS,
however, BSD/OS changed untimeout(), and I've just changed callout_stop()
to be more conservative.

Obtained from: BSD/OS


# 81370 09-Aug-2001 jhb

Axe spl's obsoleted by the callout mutex.


# 74914 28-Mar-2001 jhb

Catch up to header include changes:
- <sys/mutex.h> now requires <sys/systm.h>
- <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>


# 72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 69147 25-Nov-2000 jlemon

Revert the last commit to the callout interface, and add a flag to
callout_init() indicating whether the callout is safe or not. Update
the callers of callout_init() to reflect the new interface.

Okayed by: Jake


# 69136 25-Nov-2000 jake

- Rename callout_reset to _callout_reset and add a flags argument.
- Add macros callout_reset, which does the obvious, and
mp_callout_reset, which passes the CALLOUT_MPSAFE flag.


# 68889 19-Nov-2000 jake

- Protect the callout wheel with a separate spin mutex, callout_lock.
- Use the mutex in hardclock to ensure no races between it and
softclock.
- Make softclock be INTR_MPSAFE and provide a flag,
CALLOUT_MPSAFE, which specifies that a callout handler does not
need giant. There is still no way to set this flag when
regstering a callout.

Reviewed by: -smp@, jlemon


# 68869 17-Nov-2000 jhb

Release sched_lock very briefly to give interrupts a chance to fire if we
are in softclock() for a long time. The old code already did an
splx()/slphigh() pair here, I just missed adding in the equivalent mutex
operations on sched_lock earlier.


# 68840 16-Nov-2000 jhb

The recent changes to msleep() and mawait() resulted in timeout() and
untimeout() not being called with Giant in those functions. For now,
use the sched_lock to protect the callout wheel in softclock() and in
the various timeout and callout functions.

Noticed by: tegge


# 67551 25-Oct-2000 jhb

- Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.

Submitted by: cp


# 50673 30-Aug-1999 jlemon

Restructure TCP timeout handling:

- eliminate the fast/slow timeout lists for TCP and instead use a
callout entry for each timer.
- increase the TCP timer granularity to HZ
- implement "bad retransmit" recovery, as presented in
"On Estimating End-to-End Network Path Properties", by Allman and Paxson.

Submitted by: jlemon, wollmann


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 44527 06-Mar-1999 wollman

Fix callout_init(). This didn't have any practical effect since it
was only used to initialize the static timeouts, which unconditionally
clears the only bits which could have caused problems.


# 44510 06-Mar-1999 wollman

Expose a slightly-lower-level interface to timeouts which allows callers
to manage their own memory. Tested on my machine (make buildworld).
I've made analogous changes on the alpha, but don't have a machine
to test.

Not-objected-to by: dg, gibbs


# 36127 17-May-1998 bde

Fixed stale references to hzto() in comments.


# 33824 25-Feb-1998 bde

Declare function pointer args as pointers, not as functions.


# 33392 15-Feb-1998 phk

A bunch of nits from bde.


# 32512 14-Jan-1998 phk

Make softticks static.
Remove unneeded stuff.


# 32412 10-Jan-1998 phk

Fix softclock calling so we don't loose timeouts (I broke this ~10h ago)


# 32391 10-Jan-1998 phk

Whoops. softclock is called from doreti_swi as well. Abandon call from
hardclock().

Forgot this:

Pointed hat sent by: bd


# 32388 10-Jan-1998 phk

Effect the divorce of kern_clock.c and kern_timeout.c (which was
repository copied from kern_clock.c)


# 32323 07-Jan-1998 phk

Improve hardpps readability a bit:
* Rename usec to p_usec so you can search for it.
* Macroize the huge median_of_3_samples if statement.


# 31950 23-Dec-1997 nate

This patch causes the "calltodo" timer list to be decremented by the amount
of time that the laptop was suspending. Thus, select() calls that might have
suspended rather than firing at 1hr + "time suspended" since the timer was
posted.

Adding:

options APM_FIXUP_CALLTODO

to the kernel config enables the patch.

[
This patch was slightly modified to use a consistant indent style and
I removed some unused local variables. After this has been tested a
few weeks we'll make the options the default, so for now I'm now
documenting it in LINT. Mike can later if he wants.
]

Reviewed by: Mike Smith <msmith@freebsd.org>
Submitted by: Ken Key <key@cs.utk.edu>


# 31639 08-Dec-1997 fsmp

The improvements to clock statistics by Tor Egge
Wrappered and enabled by the define BETTER_CLOCK (on by default in smpyests.h)

Reviewed by: smp@csn.net
Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>


# 31393 24-Nov-1997 bde

Removed all traces of P_IDLEPROC. It was tested but never set.


# 31259 18-Nov-1997 bde

Removed unused #include.


# 31016 07-Nov-1997 phk

Remove a bunch of variables which were unused both in GENERIC and LINT.

Found by: -Wunused


# 29805 24-Sep-1997 gibbs

Store an absolute tick value in callout entries so that a subtraction on
hash chain traversal isn't needed. This also allows untimeout to recompute
the hash to find the bucket that the entry to remove is stored in so
that each callout entry no longer needs to store that information.

Reviewed by: Nate Williams <nate@mt.sri.com>


# 29680 21-Sep-1997 gibbs

init_main.c subr_autoconf.c:
Add support for "interrupt driven configuration hooks".
A component of the kernel can register a hook, most likely
during auto-configuration, and receive a callback once
interrupt services are available. This callback will occur before
the root and dump devices are configured, so the configuration
task can affect the selection of those two devices or complete
any tasks that need to be performed prior to launching init.
System boot is posponed so long as a hook is registered. The
hook owner is responsible for removing the hook once their task
is complete or the system boot can continue.

kern_acct.c kern_clock.c kern_exit.c kern_synch.c kern_time.c:
Change the interface and implementation for the kernel callout
service. The new implemntaion is based on the work of
Adam M. Costello and George Varghese, published in a technical
report entitled "Redesigning the BSD Callout and Timer Facilities".
The interface used in FreeBSD is a little different than the one
outlined in the paper. The new function prototypes are:

struct callout_handle timeout(void (*func)(void *),
void *arg, int ticks);

void untimeout(void (*func)(void *), void *arg,
struct callout_handle handle);

If a client wishes to remove a timeout, it must store the
callout_handle returned by timeout and pass it to untimeout.

The new implementation gives 0(1) insert and removal of callouts
making this interface scale well even for applications that
keep 100s of callouts outstanding.

See the updated timeout.9 man page for more details.


# 29179 07-Sep-1997 bde

Some staticized variables were still declared to be extern.


# 29041 02-Sep-1997 bde

Removed unused #includes.


# 28551 21-Aug-1997 bde

#include <machine/limits.h> explicitly in the few places that it is required.


# 26897 24-Jun-1997 jhay

Add tickadj to struct clockinfo, like NetBSD and OpenBSD.
NOTE: libc, time, kgmon and rpc.rstatd will have to be recompiled.


# 25164 26-Apr-1997 peter

Man the liferafts! Here comes the long awaited SMP -> -current merge!

There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people. A special thanks to Steve Passe for implementing
the APIC code!


# 24117 22-Mar-1997 mpp

Restore Bruce's original comment. It seems that "iff" = if and only if,
and is not a typo. It is used other places in the kernel, too.


# 24109 22-Mar-1997 mpp

Fix a typo in a comment of a recent commit.


# 24101 22-Mar-1997 bde

Fixed some invalid (non-atomic) accesses to `time', mostly ones of the
form `tv = time'. Use a new function gettime(). The current version
just forces atomicicity without fixing precision or efficiency bugs.
Simplified some related valid accesses by using the central function.


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22521 10-Feb-1997 dyson

This is the kernel Lite/2 commit. There are some requisite userland
changes, so don't expect to be able to run the kernel as-is (very well)
without the appropriate Lite/2 userland changes.

The system boots and can mount UFS filesystems.

Untested: ext2fs, msdosfs, NFS
Known problems: Incorrect Berkeley ID strings in some files.
Mount_std mounts will not work until the getfsent
library routine is changed.

Reviewed by: various people
Submitted by: Jeffery Hsu <hsu@freebsd.org>


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 21101 30-Dec-1996 jhay

Update our kernel ntp code to the latest from David Mills. The main change
is the addition of the FLL code, which is used by the latest versions of
xntpd. The kernel PPS code is also updated, although I can't test that yet.


# 19172 25-Oct-1996 bde

Improved biasing of i586 clock by adjusting for hardclock() latency.
I decided to do this for every hardclock() call instead of lazily
in microtime(). The lazy method is simpler but has more overhead
if microtime() is called a lot.

CPU_THISTICKLEN() is now a no-op and should probably go away.
Previously it did nothing directly but had the side effect of
setting i586_last_tick for CPU_CLOCKUPDATE() and i586_avg_tick for
debugging. CPU_CLOCKUPDATE() now uses a better method and
i586_avg_tick is too much trouble to maintain.

Reduced nesting of #includes in the usual case.

Increased nesting of #includes when CLOCK_HAIR is defined. This
is a kludge to get typedefs for inline functions only when the
inline functions are used. Normally only kern_clock.c defines
this. kern_clock.c can't include the i386 headers directly.

Removed unused LOCORE support.


# 18855 10-Oct-1996 bde

Don't include "opt_cpu.h" in <machine/clock.h>, since this breaks lkm's.
The change breaks kern_clock.c; fix that temporarily by including
"opt_cpu.h" there.


# 17342 30-Jul-1996 bde

Fixed resource usage integrals. They were too large by a factor of
of profhz/stathz when profiling was enabled.


# 16635 23-Jun-1996 bde

Unstaticize psratio and staticize profprocs. psratio needs to be exported
to trap.c to fix user profiling.


# 12913 17-Dec-1995 phk

Staticize.
Unstaticize a function in scsi/scsi_base that was used, with an undocumented
option.
My last count on the LINT kernel shows:
Total symbols: 3647
unref symbols: 463
undef symbols: 4
1 ref symbols: 1751
2 ref symbols: 485
Approaching the pain threshold now.


# 12662 07-Dec-1995 dg

Untangled the vm.h include file spaghetti.


# 12650 06-Dec-1995 phk

A couple of minor tweaks to the sysctl stuff.


# 12623 04-Dec-1995 phk

A major sweep over the sysctl stuff.

Move a lot of variables home to their own code (In good time before xmas :-)

Introduce the string descrition of format.

Add a couple more functions to poke into these marvels, while I try to
decide what the correct interface should look like.

Next is adding vars on the fly, and sysctl looking at them too.

Removed a tine bit of defunct and #ifdefed notused code in swapgeneric.


# 12569 02-Dec-1995 bde

Finished (?) cleaning up sysinit stuff.


# 12243 12-Nov-1995 phk

The entire sysctl callback to read/write version. I havn't tested this as
much as I'd like to, but the malloc stunt I tried for an interim for
sure does worse.
Now we can read and write from any kind of address-space, not only
user and kernel, using callbacks.
This may be over-generalization for now, but it's actually simpler.


# 12152 08-Nov-1995 phk

Fix some of the sysctl broke, and add a lot more to it.


# 11451 12-Oct-1995 wollman

Improve clock accuracy by accounting for late/missed clock interrupts
if the hardware supports it.


# 10653 09-Sep-1995 dg

Fixed init functions argument type - caddr_t -> void *. Fixed a couple of
compiler warnings.


# 10358 28-Aug-1995 julian

Reviewed by: julian with quick glances by bruce and others
Submitted by: terry (terry lambert)
This is a composite of 3 patch sets submitted by terry.
they are:
New low-level init code that supports loadbal modules better
some cleanups in the namei code to help terry in 16-bit character support
some changes to the mount-root code to make it a little more
modular..

NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able
to test those cases..

certainly mounting root of disk still works just fine..
mfs should work but is untested. (tomorrows task)

The low level init stuff includes a total rewrite of init_main.c
to make it possible for new modules to have an init phase by simply
adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can
be added to the kernel without editing any other files other than the
'files' file.


# 9759 29-Jul-1995 bde

Eliminate sloppy common-style declarations. There should be none left for
the LINT configuation.


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 7090 16-Mar-1995 bde

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 5081 12-Dec-1994 bde

Obtained from: my old fix for 1.1.5

Improve hzto():

Round up instead of down and then add 1 tick. This fixes sleep(1)
sometimes sleeping for < 1 second and usleep(10000) sometimes sleeping
for as little as 1 usec + syscall time.

Don't do all the calculations at splhigh().

Don't depend on `tick' being a multiple of 1000.

Don't lose accuracy for `sec' between 0x7fffffff / 1000 - 1000 and
0x7fffffff / hz.

Don't assume that longs are 32 bits or that ints have the same size as
longs.


# 3640 16-Oct-1994 wollman

kern_clock.c: define dk_names[][].
kern_sysctl.c: call dev_sysctl for hw.devconf mib subtree
kern_devconf.c: sysctl-accessible device-configuration and -management
interface


# 3308 02-Oct-1994 phk

All of this is cosmetic. prototypes, #includes, printfs and so on. Makes
GCC a lot more silent.


# 3183 28-Sep-1994 wollman

Fixed bug in hardclock() that caused adjtime() to fail when given
a negative offset. This would be seen in xntpd as a rash of
``Previous time adjustment didn't complete'' messages on startup.


# 3098 25-Sep-1994 phk

While in the real world, I had a bad case of being swapped out for a lot of
cycles. While waiting there I added a lot of the extra ()'s I have, (I have
never used LISP to any extent). So I compiled the kernel with -Wall and
shut up a lot of "suggest you add ()'s", removed a bunch of unused var's
and added a couple of declarations here and there. Having a lap-top is
highly recommended. My kernel still runs, yell at me if you kernel breaks.


# 2858 18-Sep-1994 wollman

Redo Kernel NTP PLL support, kernel side.

This code is mostly taken from the 1.1 port (which was in turn taken from
Dave Mills's kern.tar.Z example). A few significant differences:

1) ntp_gettime() is now a MIB variable rather than a system call. A few
fiddles are done in libc to make it behave the same.

2) mono_time does not participate in the PLL adjustments.

3) A new interface has been defined (in <machine/clock.h>) for doing
possibly machine-dependent things around the time of the clock update.
This is used in Pentium kernels to disable interrupts, set `time', and
reset the CPU cycle counter as quickly as possible to avoid jitter in
microtime(). Measurements show an apparent resolution of a bit more than
8.14usec, which is reasonable given system-call overhead.


# 2320 27-Aug-1994 dg

1) Changed ddb into a option rather than a pseudo-device (use options DDB
in your kernel config now).
2) Added ps ddb function from 1.1.5. Cleaned it up a bit and moved into its
own file.
3) Added \r handing in db_printf.
4) Added missing memory usage stats to statclock().
5) Added dummy function to pseudo_set so it will be emitted if there
are no other pseudo declarations.


# 2112 18-Aug-1994 wollman

Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.


# 1817 02-Aug-1994 dg

Added $Id$


# 1549 25-May-1994 rgrimes

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# 1541 24-May-1994 rgrimes

BSD 4.4 Lite Kernel Sources