History log of /freebsd-11-stable/sys/x86/x86/tsc.c
Revision Date Author Comments
(<<< Hide modified files)
(Show modified files >>>)
# 363433 22-Jul-2020 jkim

MFC: r362509

Assume all TSCs are synchronized for AMD Family 17h processors and later
when it has passed the synchronization test.


# 353007 02-Oct-2019 kib

MFC r352684:
x86: Fall back to leaf 0x16 if TSC frequency is obtained by CPUID and
leaf 0x15 is not functional.


# 335657 26-Jun-2018 avg

MFC r334204,r334338: re-synchronize TSC-s on SMP systems after resume


# 333161 02-May-2018 kib

MFC r333002:
Use CPUID leaf 0x15 to get TSC frequency when the calibration is
disabled.


# 328386 25-Jan-2018 pkelsey

MFC r316648:

Corrected misspelled versions of rendezvous.

The MFC maintains smp_no_rendevous_barrier() as a symbol alias of
smp_no_rendezvous_barrier().

__FreeBSD_version bumped to indicate presence of the new name
smp_no_rendezvous_barrier().

Reviewed by: gnn, jhb (email), kib
Differential Revision: https://reviews.freebsd.org/D10313


# 314999 10-Mar-2017 kib

MFC r314211:
Remove cpu_deepest_sleep variable.


# 305866 16-Sep-2016 kib

MFC r304285:
Implement userspace gettimeofday(2) with HPET timecounter.


# 302408 07-Jul-2016 gjb

Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle.
Prune svn:mergeinfo from the new branch, as nothing has been merged
here.

Additional commits post-branch will follow.

Approved by: re (implicit)
Sponsored by: The FreeBSD Foundation


/freebsd-11-stable/MAINTAINERS
/freebsd-11-stable/cddl
/freebsd-11-stable/cddl/contrib/opensolaris
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/dtrace/test/tst/common/print
/freebsd-11-stable/cddl/contrib/opensolaris/cmd/zfs
/freebsd-11-stable/cddl/contrib/opensolaris/lib/libzfs
/freebsd-11-stable/contrib/amd
/freebsd-11-stable/contrib/apr
/freebsd-11-stable/contrib/apr-util
/freebsd-11-stable/contrib/atf
/freebsd-11-stable/contrib/binutils
/freebsd-11-stable/contrib/bmake
/freebsd-11-stable/contrib/byacc
/freebsd-11-stable/contrib/bzip2
/freebsd-11-stable/contrib/com_err
/freebsd-11-stable/contrib/compiler-rt
/freebsd-11-stable/contrib/dialog
/freebsd-11-stable/contrib/dma
/freebsd-11-stable/contrib/dtc
/freebsd-11-stable/contrib/ee
/freebsd-11-stable/contrib/elftoolchain
/freebsd-11-stable/contrib/elftoolchain/ar
/freebsd-11-stable/contrib/elftoolchain/brandelf
/freebsd-11-stable/contrib/elftoolchain/elfdump
/freebsd-11-stable/contrib/expat
/freebsd-11-stable/contrib/file
/freebsd-11-stable/contrib/gcc
/freebsd-11-stable/contrib/gcclibs/libgomp
/freebsd-11-stable/contrib/gdb
/freebsd-11-stable/contrib/gdtoa
/freebsd-11-stable/contrib/groff
/freebsd-11-stable/contrib/ipfilter
/freebsd-11-stable/contrib/ldns
/freebsd-11-stable/contrib/ldns-host
/freebsd-11-stable/contrib/less
/freebsd-11-stable/contrib/libarchive
/freebsd-11-stable/contrib/libarchive/cpio
/freebsd-11-stable/contrib/libarchive/libarchive
/freebsd-11-stable/contrib/libarchive/libarchive_fe
/freebsd-11-stable/contrib/libarchive/tar
/freebsd-11-stable/contrib/libc++
/freebsd-11-stable/contrib/libc-vis
/freebsd-11-stable/contrib/libcxxrt
/freebsd-11-stable/contrib/libexecinfo
/freebsd-11-stable/contrib/libpcap
/freebsd-11-stable/contrib/libstdc++
/freebsd-11-stable/contrib/libucl
/freebsd-11-stable/contrib/libxo
/freebsd-11-stable/contrib/llvm
/freebsd-11-stable/contrib/llvm/projects/libunwind
/freebsd-11-stable/contrib/llvm/tools/clang
/freebsd-11-stable/contrib/llvm/tools/lldb
/freebsd-11-stable/contrib/llvm/tools/llvm-dwarfdump
/freebsd-11-stable/contrib/llvm/tools/llvm-lto
/freebsd-11-stable/contrib/mdocml
/freebsd-11-stable/contrib/mtree
/freebsd-11-stable/contrib/ncurses
/freebsd-11-stable/contrib/netcat
/freebsd-11-stable/contrib/ntp
/freebsd-11-stable/contrib/nvi
/freebsd-11-stable/contrib/one-true-awk
/freebsd-11-stable/contrib/openbsm
/freebsd-11-stable/contrib/openpam
/freebsd-11-stable/contrib/openresolv
/freebsd-11-stable/contrib/pf
/freebsd-11-stable/contrib/sendmail
/freebsd-11-stable/contrib/serf
/freebsd-11-stable/contrib/sqlite3
/freebsd-11-stable/contrib/subversion
/freebsd-11-stable/contrib/tcpdump
/freebsd-11-stable/contrib/tcsh
/freebsd-11-stable/contrib/tnftp
/freebsd-11-stable/contrib/top
/freebsd-11-stable/contrib/top/install-sh
/freebsd-11-stable/contrib/tzcode/stdtime
/freebsd-11-stable/contrib/tzcode/zic
/freebsd-11-stable/contrib/tzdata
/freebsd-11-stable/contrib/unbound
/freebsd-11-stable/contrib/vis
/freebsd-11-stable/contrib/wpa
/freebsd-11-stable/contrib/xz
/freebsd-11-stable/crypto/heimdal
/freebsd-11-stable/crypto/openssh
/freebsd-11-stable/crypto/openssl
/freebsd-11-stable/gnu/lib
/freebsd-11-stable/gnu/usr.bin/binutils
/freebsd-11-stable/gnu/usr.bin/cc/cc_tools
/freebsd-11-stable/gnu/usr.bin/gdb
/freebsd-11-stable/lib/libc/locale/ascii.c
/freebsd-11-stable/sys/cddl/contrib/opensolaris
/freebsd-11-stable/sys/contrib/dev/acpica
/freebsd-11-stable/sys/contrib/ipfilter
/freebsd-11-stable/sys/contrib/libfdt
/freebsd-11-stable/sys/contrib/octeon-sdk
/freebsd-11-stable/sys/contrib/x86emu
/freebsd-11-stable/sys/contrib/xz-embedded
/freebsd-11-stable/usr.sbin/bhyve/atkbdc.h
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.c
/freebsd-11-stable/usr.sbin/bhyve/bhyvegc.h
/freebsd-11-stable/usr.sbin/bhyve/console.c
/freebsd-11-stable/usr.sbin/bhyve/console.h
/freebsd-11-stable/usr.sbin/bhyve/pci_fbuf.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.c
/freebsd-11-stable/usr.sbin/bhyve/pci_xhci.h
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.c
/freebsd-11-stable/usr.sbin/bhyve/ps2kbd.h
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.c
/freebsd-11-stable/usr.sbin/bhyve/ps2mouse.h
/freebsd-11-stable/usr.sbin/bhyve/rfb.c
/freebsd-11-stable/usr.sbin/bhyve/rfb.h
/freebsd-11-stable/usr.sbin/bhyve/sockstream.c
/freebsd-11-stable/usr.sbin/bhyve/sockstream.h
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.c
/freebsd-11-stable/usr.sbin/bhyve/usb_emul.h
/freebsd-11-stable/usr.sbin/bhyve/usb_mouse.c
/freebsd-11-stable/usr.sbin/bhyve/vga.c
/freebsd-11-stable/usr.sbin/bhyve/vga.h
# 277900 29-Jan-2015 jhb

Opt for performance over power-saving on Intel CPUs that have a
P-state but not C-state invariant TSC by changing the default behavior
to leaving the TSC enabled as the timecounter and disabling C2+ instead
of disabling the TSC by default.

Discussed with: jkim
Tested by: Jan Kokemuller <jan.kokemueller@gmail.com>


# 277406 20-Jan-2015 neel

Update the vdso timehands only via tc_windup().

Prior to this change CLOCK_MONOTONIC could go backwards when the timecounter
hardware was changed via 'sysctl kern.timecounter.hardware'. This happened
because the vdso timehands update was missing the special treatment in
tc_windup() when changing timecounters.

Reviewed by: kib


# 276724 05-Jan-2015 jhb

On some Intel CPUs with a P-state but not C-state invariant TSC the TSC
may also halt in C2 and not just C3 (it seems that in some cases the BIOS
advertises its C3 state as a C2 state in _CST). Just play it safe and
disable both C2 and C3 states if a user forces the use of the TSC as the
timecounter on such CPUs.

PR: 192316
Differential Revision: https://reviews.freebsd.org/D1441
No objection from: jkim
MFC after: 1 week


# 273800 28-Oct-2014 jhb

Rework virtual machine hypervisor detection.
- Move the existing code to x86/x86/identcpu.c since it is x86-specific.
- If the CPUID2_HV flag is set, assume a hypervisor is present and query
the 0x40000000 leaf to determine the hypervisor vendor ID. Export the
vendor ID and the highest supported hypervisor CPUID leaf via
hv_vendor[] and hv_high variables, respectively. The hv_vendor[]
array is also exported via the hw.hv_vendor sysctl.
- Merge the VMWare detection code from tsc.c into the new probe in
identcpu.c. Add a VM_GUEST_VMWARE to identify vmware and use that in
the TSC code to identify VMWare.

Differential Revision: https://reviews.freebsd.org/D1010
Reviewed by: delphij, jkim, neel


# 273174 16-Oct-2014 davide

Follow up to r225617. In order to maximize the re-usability of kernel code
in userland rename in-kernel getenv()/setenv() to kern_setenv()/kern_getenv().
This fixes a namespace collision with libc symbols.

Submitted by: kmacy
Tested by: make universe


# 271082 04-Sep-2014 jhb

- Move blacklists of broken TSCs out of the printcpuinfo() function
and into the TSC probe routine.
- Initialize cpu_exthigh once in finishidentcpu() which is called
before printcpuinfo() (and matches the behavior on amd64).


# 267992 28-Jun-2014 hselasky

Pull in r267961 and r267973 again. Fix for issues reported will follow.


# 267985 27-Jun-2014 gjb

Revert r267961, r267973:

These changes prevent sysctl(8) from returning proper output,
such as:

1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory


# 267961 27-Jun-2014 hselasky

Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after: 2 weeks
Sponsored by: Mellanox Technologies


# 249625 18-Apr-2013 mav

Introduce kern.timecounter.smp_tsc_adjust tunable (disabled by default) and
respective functionality, allowing to synchronize TSC on APs to match BSP's
during boot. It may be unsafe in general case due to theoretical chance of
later drift if CPUs are using different clock rate or source, but it allows
to use TSC in some cases when difference caused by some initialization bug,
while TSCs are known to increment synchronously.

Reviewed by: jimharris, kib
MFC after: 1 month


# 249324 10-Apr-2013 neel

Unsynchronized TSCs on the host require special handling in bhyve:

- use clock_gettime(2) as the time base for the emulated ACPI timer instead
of directly using rdtsc().

- don't advertise the invariant TSC capability to the guest to discourage it
from using the TSC as its time base.

Discussed with: jhb@ (about making 'smp_tsc' a global)
Reported by: Dan Mack on freebsd-virtualization@
Obtained from: NetApp


# 246212 01-Feb-2013 kib

The change to reduce default smp_tsc_shift caused tsc shift to become
zero on slower machines, which make the fenced get_timecount methods
not used despite needed. Remove the (shift > 0) condition when
selecting the get_timecount() implementation.

Rename smp_tsc_shift to tsc_shift, and apply it for the UP case too.
Allow shift to reach value of 31 instead of 30, as it was previously
(should be nop).

Reorganize the tc quality calculation to remove the conditionally
compiled block. Rename test_smp_tsc() to test_tsc() and provide
separate versions for SMP and UP builds. The check for virtialized
hardware is more natural to perform in the smp version of the
test_tsc(), since it is only done for smp case.

Noted and reviewed by: bde (previous version)
MFC after: 12 days


# 246116 30-Jan-2013 kib

Reduce default shift used to calculate the max frequency for the TSC
timecounter to 1, and correspondingly increase the precision of the
gettimeofday(2) and related functions in the default configuration.

The motivation for the TSC-low timecounter, as described in the
r222866, seems to provide a workaround for the non-serializing
behaviour of the RDTSC on some Intel hardware. Tests demonstrate that
even with the pre-shift of 8, the cross-core non-monotonicity of the
RDTSC is still observed reliably, e.g. on the Nehalems. The r238755
and r238973 implemented the proper fix for the issue.

The pre-shift of 1 is applied to keep TSC not overflowing for the
frequency of hardclock down to 2 sec/intr. The pre-shift is made a
tunable to allow the easy debugging of the issues users could see with
the shift being too low.

Reviewed by: bde
MFC after: 2 weeks


# 239133 07-Aug-2012 jimharris

During TSC synchronization test, use rdtsc() rather than rdtsc32(), to
protect against 32-bit TSC overflow while the sync test is running.

On dual-socket Xeon E5-2600 (SNB) systems with up to 32 threads, there
is non-trivial chance (2-3%) that TSC synchronization test fails due to
32-bit TSC overflow while the synchronization test is running.

Sponsored by: Intel
Reviewed by: jkim
Discussed with: jkim, kib


# 238975 01-Aug-2012 kib

Do a trivial reformatting of the comment, to record the proper commit
message for r238973:

Rdtsc instruction is not synchronized, it seems on some Intel cores it
can bypass even the locked instructions. As a result, rdtsc executed
on different cores may return unordered TSC values even when the rdtsc
appearance in the instruction sequences is provably ordered.

Similarly to what has been done in r238755 for TSC synchronization
test, add explicit fences right before rdtsc in the timecounters 'get'
functions. Intel recommends to use LFENCE, while AMD refers to
MFENCE. For VIA follow what Linux does and use LFENCE. With this
change, I see no reordered reads of TSC on Nehalem.

Change the rmb() to inlined CPUID in the SMP TSC synchronization test.
On i386, locked instruction is used for rmb(), and as noted earlier,
it is not enough. Since i386 machine may not support SSE2, do simplest
possible synchronization with CPUID.

MFC after: 1 week
Discussed with: avg, bde, jkim


# 238973 01-Aug-2012 kib

diff --git a/sys/x86/x86/tsc.c b/sys/x86/x86/tsc.c
index c253a96..3d8bd30 100644
--- a/sys/x86/x86/tsc.c
+++ b/sys/x86/x86/tsc.c
@@ -82,7 +82,11 @@ static void tsc_freq_changed(void *arg, const struct cf_level *level,
static void tsc_freq_changing(void *arg, const struct cf_level *level,
int *status);
static unsigned tsc_get_timecount(struct timecounter *tc);
-static unsigned tsc_get_timecount_low(struct timecounter *tc);
+static inline unsigned tsc_get_timecount_low(struct timecounter *tc);
+static unsigned tsc_get_timecount_lfence(struct timecounter *tc);
+static unsigned tsc_get_timecount_low_lfence(struct timecounter *tc);
+static unsigned tsc_get_timecount_mfence(struct timecounter *tc);
+static unsigned tsc_get_timecount_low_mfence(struct timecounter *tc);
static void tsc_levels_changed(void *arg, int unit);

static struct timecounter tsc_timecounter = {
@@ -262,6 +266,10 @@ probe_tsc_freq(void)
(vm_guest == VM_GUEST_NO &&
CPUID_TO_FAMILY(cpu_id) >= 0x10))
tsc_is_invariant = 1;
+ if (cpu_feature & CPUID_SSE2) {
+ tsc_timecounter.tc_get_timecount =
+ tsc_get_timecount_mfence;
+ }
break;
case CPU_VENDOR_INTEL:
if ((amd_pminfo & AMDPM_TSC_INVARIANT) != 0 ||
@@ -271,6 +279,10 @@ probe_tsc_freq(void)
(CPUID_TO_FAMILY(cpu_id) == 0xf &&
CPUID_TO_MODEL(cpu_id) >= 0x3))))
tsc_is_invariant = 1;
+ if (cpu_feature & CPUID_SSE2) {
+ tsc_timecounter.tc_get_timecount =
+ tsc_get_timecount_lfence;
+ }
break;
case CPU_VENDOR_CENTAUR:
if (vm_guest == VM_GUEST_NO &&
@@ -278,6 +290,10 @@ probe_tsc_freq(void)
CPUID_TO_MODEL(cpu_id) >= 0xf &&
(rdmsr(0x1203) & 0x100000000ULL) == 0)
tsc_is_invariant = 1;
+ if (cpu_feature & CPUID_SSE2) {
+ tsc_timecounter.tc_get_timecount =
+ tsc_get_timecount_lfence;
+ }
break;
}

@@ -328,16 +344,31 @@ init_TSC(void)

#ifdef SMP

-/* rmb is required here because rdtsc is not a serializing instruction. */
-#define TSC_READ(x) \
-static void \
-tsc_read_##x(void *arg) \
-{ \
- uint32_t *tsc = arg; \
- u_int cpu = PCPU_GET(cpuid); \
- \
- rmb(); \
- tsc[cpu * 3 + x] = rdtsc32(); \
+/*
+ * RDTSC is not a serializing instruction, and does not drain
+ * instruction stream, so we need to drain the stream before executing
+ * it. It could be fixed by use of RDTSCP, except the instruction is
+ * not available everywhere.
+ *
+ * Use CPUID for draining in the boot-time SMP constistency test. The
+ * timecounters use MFENCE for AMD CPUs, and LFENCE for others (Intel
+ * and VIA) when SSE2 is present, and nothing on older machines which
+ * also do not issue RDTSC prematurely. There, testing for SSE2 and
+ * vendor is too cumbersome, and we learn about TSC presence from
+ * CPUID.
+ *
+ * Do not use do_cpuid(), since we do not need CPUID results, which
+ * have to be written into memory with do_cpuid().
+ */
+#define TSC_READ(x) \
+static void \
+tsc_read_##x(void *arg) \
+{ \
+ uint32_t *tsc = arg; \
+ u_int cpu = PCPU_GET(cpuid); \
+ \
+ __asm __volatile("cpuid" : : : "eax", "ebx", "ecx", "edx"); \
+ tsc[cpu * 3 + x] = rdtsc32(); \
}
TSC_READ(0)
TSC_READ(1)
@@ -487,7 +518,16 @@ init:
for (shift = 0; shift < 31 && (tsc_freq >> shift) > max_freq; shift++)
;
if (shift > 0) {
- tsc_timecounter.tc_get_timecount = tsc_get_timecount_low;
+ if (cpu_feature & CPUID_SSE2) {
+ if (cpu_vendor_id == CPU_VENDOR_AMD) {
+ tsc_timecounter.tc_get_timecount =
+ tsc_get_timecount_low_mfence;
+ } else {
+ tsc_timecounter.tc_get_timecount =
+ tsc_get_timecount_low_lfence;
+ }
+ } else
+ tsc_timecounter.tc_get_timecount = tsc_get_timecount_low;
tsc_timecounter.tc_name = "TSC-low";
if (bootverbose)
printf("TSC timecounter discards lower %d bit(s)\n",
@@ -599,16 +639,48 @@ tsc_get_timecount(struct timecounter *tc __unused)
return (rdtsc32());
}

-static u_int
+static inline u_int
tsc_get_timecount_low(struct timecounter *tc)
{
uint32_t rv;

__asm __volatile("rdtsc; shrd %%cl, %%edx, %0"
- : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
+ : "=a" (rv) : "c" ((int)(intptr_t)tc->tc_priv) : "edx");
return (rv);
}

+static u_int
+tsc_get_timecount_lfence(struct timecounter *tc __unused)
+{
+
+ lfence();
+ return (rdtsc32());
+}
+
+static u_int
+tsc_get_timecount_low_lfence(struct timecounter *tc)
+{
+
+ lfence();
+ return (tsc_get_timecount_low(tc));
+}
+
+static u_int
+tsc_get_timecount_mfence(struct timecounter *tc __unused)
+{
+
+ mfence();
+ return (rdtsc32());
+}
+
+static u_int
+tsc_get_timecount_low_mfence(struct timecounter *tc)
+{
+
+ mfence();
+ return (tsc_get_timecount_low(tc));
+}
+
uint32_t
cpu_fill_vdso_timehands(struct vdso_timehands *vdso_th)
{


# 238755 24-Jul-2012 jimharris

Add rmb() to tsc_read_##x to enforce serialization of rdtsc captures.

Intel Architecture Manual specifies that rdtsc instruction is not serialized,
so without this change, TSC synchronization test would periodically fail,
resulting in use of HPET timecounter instead of TSC-low. This caused
severe performance degradation (40-50%) when running high IO/s workloads due to
HPET MMIO reads and GEOM stat collection.

Tests on Xeon E5-2600 (Sandy Bridge) 8C systems were seeing TSC synchronization
fail approximately 20% of the time.

Sponsored by: Intel
Reviewed by: kib
MFC after: 3 days


# 237433 22-Jun-2012 kib

Implement mechanism to export some kernel timekeeping data to
usermode, using shared page. The structures and functions have vdso
prefix, to indicate the intended location of the code in some future.

The versioned per-algorithm data is exported in the format of struct
vdso_timehands, which mostly repeats the content of in-kernel struct
timehands. Usermode reading of the structure can be lockless.
Compatibility export for 32bit processes on 64bit host is also
provided. Kernel also provides usermode with indication about
currently used timecounter, so that libc can fall back to syscall if
configured timecounter is unknown to usermode code.

The shared data updates are initiated both from the tc_windup(), where
a fast task is queued to do the update, and from sysctl handlers which
change timecounter. A manual override switch
kern.timecounter.fast_gettime allows to turn off the mechanism.

Only x86 architectures export the real algorithm data, and there, only
for tsc timecounter. HPET counters page could be exported as well, but
I prefer to not further glue the kernel and libc ABI there until
proper vdso-based solution is developed.

Minimal stubs neccessary for non-x86 architectures to still compile
are provided.

Discussed with: bde
Reviewed by: jhb
Tested by: flo
MFC after: 1 month


# 225069 22-Aug-2011 silby

Disable TSC usage inside SMP VM environments. On my VMware ESXi 4.1
environment with a core i5-2500K, operation in this mode causes timeouts
from the mpt driver. Switching to the ACPI-fast timer resolves this issue.
Switching the VM back to single CPU mode also works, which is why I have
not disabled the TSC in that mode.

I did not test with KVM or other VM environments, but I am being cautious
and assuming that the TSC is not reliable in SMP mode there as well.

Reviewed by: kib
Approved by: re (kib)
MFC after: Not applicable, the timecounter code is new for 9.x


# 224042 14-Jul-2011 jkim

If TSC stops ticking in C3, disable deep sleep when the user forcefully
select TSC as timecounter hardware.

Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)


# 223426 22-Jun-2011 jkim

Set negative quality to TSC timecounter when C3 state is enabled for Intel
processors unless the invariant TSC bit of CPUID is set. Intel processors
may stop incrementing TSC when DPSLP# pin is asserted, according to Intel
processor manuals, i. e., TSC timecounter is useless if the processor can
enter deep sleep state (C3/C4). This problem was accidentally uncovered by
r222869, which increased timecounter quality of P-state invariant TSC, e.g.,
for Core2 Duo T5870 (Family 6, Model f) and Atom N270 (Family 6, Model 1c).

Reported by: Fabian Keil (freebsd-listen at fabiankeil dot de)
Ian FREISLICH (ianf at clue dot co dot za)
Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)
- Core2 Duo T5870 (C3 state available/enabled)
jkim - Xeon X5150 (C3 state unavailable)


# 223211 17-Jun-2011 jkim

Teach the compiler how to shift TSC value efficiently. As noted in r220631,
some times compiler inserts redundant instructions to preserve unused upper
32 bits even when it is casted to a 32-bit value. Unfortunately, it seems
the problem becomes more serious when it is shifted, especially on amd64.


# 222884 08-Jun-2011 jkim

Tidy up r222866.

- Re-add accidentally removed atomic op. for sysctl(9) handler.
- Remove a period(`.') at the end of a debugging message.
- Consistently spell "low" for "TSC-low" timecounter throughout.

Pointed out by: bde


# 222869 08-Jun-2011 jkim

Increase quality of TSC (or TSC-low) timecounter to 1000 if it is P-state
invariant. For SMP case (TSC-low), it also has to pass SMP synchronization
test and the CPU vendor/model has to be white-listed explicitly. Currently,
all Intel CPUs and single-socket AMD Family 15h processors are listed here.

Discussed with: hackers


# 222866 08-Jun-2011 jkim

Introduce low-resolution TSC timecounter "TSC-low". It replaces the normal
TSC timecounter if TSC frequency is higher than ~4.29 MHz (or 2^32-1 Hz) or
multiple CPUs are present. The "TSC-low" frequency is always lower than a
preset maximum value and derived from TSC frequency (by being halved until
it becomes lower than the maximum). Note the maximum value for SMP case is
significantly lower than UP case because we want to reduce (rare but known)
"temporal anomalies" caused by non-serialized RDTSC instruction. Normally,
it is still higher than "ACPI-fast" timecounter frequency (which was default
timecounter hardware for long time until r222222) to be useful.


# 222864 08-Jun-2011 jkim

Remove a redundant assignment since r221703.


# 221703 09-May-2011 jkim

Implement boot-time TSC synchronization test for SMP. This test is executed
when the user has indicated that the system has synchronized TSCs or it has
P-state invariant TSCs. For the former case, we may clear the tunable if it
fails the test to prevent accidental foot-shooting. For the latter case, we
may set it if it passes the test to notify the user that it may be usable.


# 221331 02-May-2011 jkim

Fix build with clang. Please note there is an LLVM/Clang PR:

http://llvm.org/bugs/show_bug.cgi?id=9379

Reported by: rpaulo, dim


# 221214 29-Apr-2011 jkim

Detect VMware guest and set the TSC frequency as reported by the hypervisor.
VMware products virtualize TSC and it run at fixed frequency in so-called
"apparent time". Although virtualized i8254 also runs in apparent time, TSC
calibration always gives slightly off frequency because of the complicated
timer emulation and lost-tick correction mechanism.


# 221178 28-Apr-2011 jkim

Turn off periodic recalibration of CPU ticker frequency if it is invariant.


# 220637 14-Apr-2011 jkim

Work around an emulator problem where virtual CPU advertises TSC is P-state
invariant and APERF/MPERF MSRs exist but these MSRs never tick. When we
calculate effective frequency from cpu_est_clockrate(), it caused panic of
division-by-zero. Now we test whether these MSRs actually increase to avoid
such foot-shooting.

Reported by: dim
Tested by: dim


# 220632 14-Apr-2011 jkim

Use newly added rdtsc32() for the timecounter_get_t method.


# 220613 13-Apr-2011 jkim

Add some tunable descriptions about x86 timers.

Requested by: arundel


# 220579 12-Apr-2011 jkim

Probe capability to find effective frequency. When the TSC is P-state
invariant, APERF/MPERF ratio can be used to find effective frequency.


# 220577 12-Apr-2011 jkim

Add a new tunable 'machdep.disable_tsc_calibration' to allow skipping TSC
frequency calibration. For Intel processors, if brand string from CPUID
contains its nominal frequency, this frequency is used instead.


# 220433 07-Apr-2011 jkim

Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now. More worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly). Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).


# 219700 16-Mar-2011 jkim

Revert r219676.

Requested by: jhb, bde


# 219676 15-Mar-2011 jkim

Do not let machdep.tsc_freq modify tsc_freq itself. It is bad for i386 as
it does not operate atomically. Actually, it serves no purpose.

Noticed by: bde


# 219673 15-Mar-2011 jkim

Deprecate tsc_present as the last of its real consumers finally disappeared.


# 219473 10-Mar-2011 jkim

Add a tunable "machdep.disable_tsc" to turn off TSC. Specifically, it turns
off boot-time CPU frequency calibration, DELAY(9) with TSC, and using TSC as
a CPU ticker. Note tsc_present does not change by this tunable.


# 219469 10-Mar-2011 jkim

Turn off pointless P-state invariant TSC detection based on CPU model
on a virtual machine.


# 219461 10-Mar-2011 jkim

Deprecate rarely used tsc_is_broken. Instead, we zero out tsc_freq because
it is almost always used with tsc_freq any way.


# 217616 19-Jan-2011 mdf

Introduce signed and unsigned version of CTLTYPE_QUAD, renaming
existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().


# 216337 09-Dec-2010 jkim

Remove AMD Family 0Fh, Model 6Bh, Stepping 2 from the list of P-state
invariant CPUs. I do not believe this model is P-state invariant any more.
Maybe cpufreq(4) was broken at the time of commit. :-(


# 216283 07-Dec-2010 jkim

Merge sys/amd64/amd64/tsc.c and sys/i386/i386/tsc.c and move to sys/x86/x86.

Discussed with: avg


# 216279 07-Dec-2010 jkim

Use int for 'tsc_present' instead of u_int. It is just a boolean.


# 216276 07-Dec-2010 jkim

Remove stale comments about P-state invariant TSC and fix style(9) nits.


# 216274 07-Dec-2010 jkim

Now the P-state invariant TSC is probed early enough, do not register event
handlers for CPU freqency changes when it is found P-state invariant.
Adjust a comment about non-existent tsc_freq_max() while I am here.


# 216272 07-Dec-2010 jkim

Probe P-state invariant TSC from rightful place.


# 216163 03-Dec-2010 jkim

Revert r216161. It is not necessary because we zero-fill BSS anyway.

Requested by: jhb


# 216161 03-Dec-2010 jkim

Explicitly initialize TSC frequency. To calibrate TSC frequency, we use
DELAY(9) and it may use TSC in turn if TSC frequency is non-zero.

MFC after: 3 days


# 216159 03-Dec-2010 jkim

Do not change CPU ticker frequency if TSC is P-state invariant. Note this
change was meant to be committed with r184102 (and its subsequent MFCs) but
it fell off somehow.

Pointyhat to: jkim
MFC after: 3 days


# 211082 08-Aug-2010 dwmalone

Don't pass sizeof(u_int) to an argument of SYSCLT_PROC that ends up not
being used.


# 209103 12-Jun-2010 mav

Check general TSC presence before doing more specific checks and printfs.


# 184108 21-Oct-2008 jkim

Fix 'kern.timeconter.invariant_tsc' tunable and back out a redundant hack.
Somehow incomplete version was committed. :-(


# 184102 20-Oct-2008 jkim

Turn off CPU frequency change notifiers when the TSC is P-state invariant
or it is forced by setting 'kern.timecounter.invariant_tsc' tunable
to non-zero.


# 170289 04-Jun-2007 dwmalone

Despite several examples in the kernel, the third argument of
sysctl_handle_int is not sizeof the int type you want to export.
The type must always be an int or an unsigned int.

Remove the instances where a sizeof(variable) is passed to stop
people accidently cut and pasting these examples.

In a few places this was sysctl_handle_int was being used on 64 bit
types, which would truncate the value to be exported. In these
cases use sysctl_handle_quad to export them and change the format
to Q so that sysctl(1) can still print them.


# 167905 26-Mar-2007 njl

Add an interface for drivers to be notified of changes to CPU frequency.
cpufreq_pre_change is called before the change, giving each driver a chance
to revoke the change. cpufreq_post_change provides the results of the
change (success or failure). cpufreq_levels_changed gives the unit number
of the cpufreq device whose number of available levels has changed. Hook
in all the drivers I could find that needed it.

* TSC: update TSC frequency value. When the available levels change, take the
highest possible level and notify the timecounter set_cputicker() of that
freq. This gets rid of the "calcru: runtime went backwards" messages.
* identcpu: updates the sysctl hw.clockrate value
* Profiling: if profiling is active when the clock changes, let the user
know the results may be inaccurate.

Reviewed by: bde, phk
MFC after: 1 month


# 160964 04-Aug-2006 yar

Commit the results of the typo hunt by Darren Pilgrim.
This change affects documentation and comments only,
no real code involved.

PR: misc/101245
Submitted by: Darren Pilgrim <darren pilgrim bitfreak org>
Tested by: md5(1)
MFC after: 1 week


# 155534 11-Feb-2006 phk

CPU time accounting speedup (step 2)

Keep accounting time (in per-cpu) cputicks and the statistics counts
in the thread and summarize into struct proc when at context switch.

Don't reach across CPUs in calcru().

Add code to calibrate the top speed of cpu_tickrate() for variable
cpu_tick hardware (like TSC on power managed machines).

Don't enforce monotonicity (at least for now) in calcru. While the
calibrated cpu_tickrate ramps up it may not be true.

Use 27MHz counter on i386/Geode.

Use TSC on amd64 & i386 if present.

Use tick counter on sparc64


# 121307 21-Oct-2003 silby

Change all SYSCTLS which are readonly and have a related TUNABLE
from CTLFLAG_RD to CTLFLAG_RDTUN so that sysctl(8) can provide
more useful error messages.


# 119452 25-Aug-2003 obrien

Fix copyright comment & FBSDID style nits.

Requested by: bde


# 118987 16-Aug-2003 phk

Give timecounters a numeric quality field.

A timecounter will be selected when registered if its quality is
not negative and no less than the current timecounters.

Add a sysctl to report all available timecounters and their qualities.

Give the dummy timecounter a solid negative quality of minus a million.

Give the i8254 zero and the ACPI 1000.

The TSC gets 800, unless APM or SMP forces it negative.

Other timecounters default to zero quality and thereby retain current
selection behaviour.


# 118550 06-Aug-2003 phk

Dont initialize a TSC timecounter until we know if it is broken or not.


# 115683 02-Jun-2003 obrien

Use __FBSDID().


# 113348 10-Apr-2003 des

Convert the SMP_TSC kernel option into a loader tunable. Also enable
the TSC timecounter on single-CPU systems even when they are running
an SMP kernel.


# 113100 04-Apr-2003 tegge

Add SMP_TSC option, which can be used on SMP systems where the TSCs
are synchronized to reduce context switch cost.


# 112367 18-Mar-2003 phk

Including <sys/stdint.h> is (almost?) universally only to be able to use
%j in printfs, so put a newsted include in <sys/systm.h> where the printf
prototype lives and save everybody else the trouble.


# 110379 05-Feb-2003 phk

This file has no longer any content from the original Berkeley file so
replace the UCB copyright with a FreeBSD 2 clause thing.

Remove some no longer relevant comments.


# 110370 05-Feb-2003 phk

i386/i386/tsc.c was repo-copied from i386/isa/clock.c.

Remove all the stuff that does not relate to the TSC.

Change the calibration to use DELAY(1000000) rather than trying to check
it against the CMOS RTC, this drastically increases precision:

Using 25 samples on a Athlon 700MHz UP machine I find:

stddev min max average
CMOS 22200 Hz -74980 Hz 34301 Hz 704928721 Hz
DELAY 1805 Hz -1984 Hz 2678 Hz 704937583 Hz

(The difference between the two averages is not statistically significant.)

expressed in PPM of the frequency:
stddev min max
CMOS 31.49 PPM -106.37 PPM 48.66 PPM
DELAY 2.56 PPM 2.81 PPM 3.80 PPM

This code will not be used until a followup commit to sys/isa/clock.c
and sys/pc98/pc98/clock.c which will only happen after some field testing.


# 110299 03-Feb-2003 phk

Split the global timezone structure into two integer fields to
prevent the compiler from optimizing assignments into byte-copy
operations which might make access to the individual fields non-atomic.

Use the individual fields throughout, and don't bother locking them with
Giant: it is no longer needed.

Inspired by: tjr


# 110296 03-Feb-2003 jake

Split statclock into statclock and profclock, and made the method for driving
statclock based on profhz when profiling is enabled MD, since most platforms
don't use this anyway. This removes the need for statclock_process, whose
only purpose was to subdivide profhz, and gets the profiling clock running
outside of sched_lock on platforms that implement suswintr.
Also changed the interface for starting and stopping the profiling clock to
do just that, instead of changing the rate of statclock, since they can now
be separate.

Reviewed by: jhb, tmm
Tested on: i386, sparc64


# 110039 29-Jan-2003 phk

Make tsc_freq a 64bit quantity.

Inspired by: http://www.theinquirer.net/?article=7481


# 107576 04-Dec-2002 phk

Use the correct value when writing the Day Of Week byte in the CMOS.
The correct range is [1...7] with Sunday=1, but we have been writing
[0...6] with Sunday=0.

The Soekris computers flagged the zero, zapped the date, so if you
rebooted your soekris on a sunday, it would come up with a wrong
date.

Bruce has a more extensive rework of this code, but we will stick with
the minimalist fix for now.

Spotted by: Soren Kristensen <soren@soekris.com>
Thanks to: Michael Sierchio <kudzu@tenebras.com>.
Confirmed by: bde
Approved by: re


# 105328 17-Oct-2002 iwasaki

1. Fix a comment. Locking _is_ needed (but not done).
2. Update a comment. We now restore much more than RTC updates and
interrupts.
3. Order change. Stop interrupts by writing to RTC_STATUSB,
restore rate bits for the interrupts by writing to RTC_STATUSA,
then enable interrupts again.
This seems to be done perfectly backwards in startrtclock().
Otherwise, the idea for this change was obtained from
startrtclock().
4. Don't stop the clock (RTCB_HALT). We only program some control bits
and don't want to stop the clock.
5. (Not really related.) Add caveats to the comment about timer_restore().
The update is non-atomic since locking is not done.

On locking:
6. rtcin() and writertc() are locked() adequately by splhigh() in RELENG_4,
but this locking is null in -current.
7. Doing things in the correct order in (3) combined with (6) is probably
enough locking for rtcrestore() in RELENG_4. In -current, the
writertc()'s race with rtcintr() unless the BIOS disables RTC interrupts.

Submitted by: bde (including commit message)
MFC after: 1 week


# 103733 21-Sep-2002 phk

Fix a 3 year old oversight: Remove the #ifdef/#endif pair now that there
is nothing between them anymore.

Spotted by: peter.


# 103527 18-Sep-2002 iwasaki

Restore status register A of RTC at resume time.
This should fix the 'too many RTC interrupts and statclock seems
broken after resume' problem.

MFC after: 1 week


# 98618 22-Jun-2002 mp

Clock frequencies reported by sysctl should be unsigned values. Discovered
when machdep.tsc_freq returned a negative number on a 2.2GHz Xeon.

Submitted by: Brian Harrison <bharrison@ironport.com>
Reviewed by: phk
MFC after: 1 week


# 95814 30-Apr-2002 phk

Don't export timecounter structures under debug. with sysctl, they
contain no truly interesting data anymore.


# 95489 26-Apr-2002 phk

Remove the tc_update() function. Any frequency change to the
timecounter will be used starting at the next second, which is
good enough for sysctl purposes. If better adjustment is needed
the NTP PLL should be used.


# 93264 27-Mar-2002 dillon

Compromise for critical*()/cpu_critical*() recommit. Cleanup the interrupt
disablement assumptions in kern_fork.c by adding another API call,
cpu_critical_fork_exit(). Cleanup the td_savecrit field by moving it
from MI to MD. Temporarily move cpu_critical*() from <arch>/include/cpufunc.h
to <arch>/<arch>/critical.c (stage-2 will clean this up).

Implement interrupt deferral for i386 that allows interrupts to remain
enabled inside critical sections. This also fixes an IPI interlock bug,
and requires uses of icu_lock to be enclosed in a true interrupt disablement.

This is the stage-1 commit. Stage-2 will occur after stage-1 has stabilized,
and will move cpu_critical*() into its own header file(s) + other things.
This commit may break non-i386 architectures in trivial ways. This should
be temporary.

Reviewed by: core
Approved by: core


# 92765 20-Mar-2002 alfred

Remove __P.


# 91328 26-Feb-2002 dillon

revert last commit temporarily due to whining on the lists.


# 91315 26-Feb-2002 dillon

STAGE-1 of 3 commit - allow (but do not require) interrupts to remain
enabled in critical sections and streamline critical_enter() and
critical_exit().

This commit allows an architecture to leave interrupts enabled inside
critical sections if it so wishes. Architectures that do not wish to do
this are not effected by this change.

This commit implements the feature for the I386 architecture and provides
a sysctl, debug.critical_mode, which defaults to 1 (use the feature). For
now you can turn the sysctl on and off at any time in order to test the
architectural changes or track down bugs.

This commit is just the first stage. Some areas of the code, specifically
the MACHINE_CRITICAL_ENTER #ifdef'd code, is strictly temporary and will
be cleaned up in the STAGE-2 commit when the critical_*() functions are
moved entirely into MD files.

The following changes have been made:

* critical_enter() and critical_exit() for I386 now simply increment
and decrement curthread->td_critnest. They no longer disable
hard interrupts. When critical_exit() decrements the counter to
0 it effectively calls a routine to deal with whatever interrupts
were deferred during the time the code was operating in a critical
section.

Other architectures are unaffected.

* fork_exit() has been conditionalized to remove MD assumptions for
the new code. Old code will still use the old MD assumptions
in regards to hard interrupt disablement. In STAGE-2 this will
be turned into a subroutine call into MD code rather then hardcoded
in MI code.

The new code places the burden of entering the critical section
in the trampoline code where it belongs.

* I386: interrupts are now enabled while we are in a critical section.
The interrupt vector code has been adjusted to deal with the fact.
If it detects that we are in a critical section it currently defers
the interrupt by adding the appropriate bit to an interrupt mask.

* In order to accomplish the deferral, icu_lock is required. This
is i386-specific. Thus icu_lock can only be obtained by mainline
i386 code while interrupts are hard disabled. This change has been
made.

* Because interrupts may or may not be hard disabled during a
context switch, cpu_switch() can no longer simply assume that
PSL_I will be in a consistent state. Therefore, it now saves and
restores eflags.

* FAST INTERRUPT PROVISION. Fast interrupts are currently deferred.
The intention is to eventually allow them to operate either while
we are in a critical section or, if we are able to restrict the
use of sched_lock, while we are not holding the sched_lock.

* ICU and APIC vector assembly for I386 cleaned up. The ICU code
has been cleaned up to match the APIC code in regards to format
and macro availability. Additionally, the code has been adjusted
to deal with deferred interrupts.

* Deferred interrupts use a per-cpu boolean int_pending, and
masks ipending, spending, and fpending. Being per-cpu variables
it is not currently necessary to lock; bus cycles modifying them.

Note that the same mechanism will enable preemption to be
incorporated as a true software interrupt without having to
further hack up the critical nesting code.

* Note: the old critical_enter() code in kern/kern_switch.c is
currently #ifdef to be compatible with both the old and new
methodology. In STAGE-2 it will be moved entirely to MD code.

Performance issues:

One of the purposes of this commit is to enhance critical section
performance, specifically to greatly reduce bus overhead to allow
the critical section code to be used to protect per-cpu caches.
These caches, such as Jeff's slab allocator work, can potentially
operate very quickly making the effective savings of the new
critical section code's performance very significant.

The second purpose of this commit is to allow architectures to
enable certain interrupts while in a critical section. Specifically,
the intention is to eventually allow certain FAST interrupts to
operate rather then defer.

The third purpose of this commit is to begin to clean up the
critical_enter()/critical_exit()/cpu_critical_enter()/
cpu_critical_exit() API which currently has serious cross pollution
in MI code (in fork_exit() and ast() for example).

The fourth purpose of this commit is to provide a framework that
allows kernel-preempting software interrupts to be implemented
cleanly. This is currently used for two forward interrupts in I386.
Other architectures will have the choice of using this infrastructure
or building the functionality directly into critical_enter()/
critical_exit().

Finally, this commit is designed to greatly improve the flexibility
of various architectures to manage critical section handling,
software interrupts, preemption, and other highly integrated
architecture-specific details.


# 89980 30-Jan-2002 bde

Don't include <isa/isavar.h> or compile code depending on it when isa
is not configured. Including <isa/isavar.h> when it is not used is
harmful as well as bogus, since it includes "isa_if.h" which is not
generated when isa is not configured.

This was fixed in 1999 but was broken by unconditionalizing PNPBIOS.


# 88322 20-Dec-2001 jhb

Introduce a standard name for the lock protecting an interrupt controller
and it's associated state variables: icu_lock with the name "icu". This
renames the imen_mtx for x86 SMP, but also uses the lock to protect
access to the 8259 PIC on x86 UP. This also adds an appropriate lock to
the various Alpha chipsets which fixes problems with Alpha SMP machines
dropping interrupts with an SMP kernel.


# 85835 01-Nov-2001 iwasaki

Some fix for the recent apm module changes.
- Now that apm loadable module can inform its existence to other kernel
components (e.g. i386/isa/clock.c:startrtclock()'s TCS hack).
- Exchange priority of SI_SUB_CPU and SI_SUB_KLD for above purpose.
- Add simple arbitration mechanism for APM vs. ACPI. This prevents
the kernel enables both of them.
- Remove obsolete `#ifdef DEV_APM' related code.
- Add abstracted interface for Powermanagement operations. Public apm(4)
functions, such as apm_suspend(), should be replaced new interfaces.
Currently only power_pm_suspend (successor of apm_suspend) is implemented.

Reviewed by: peter, arch@ and audit@


# 84721 09-Oct-2001 robert

Remove an unneeded variable declaration and statement.

Approved by: jake


# 82971 04-Sep-2001 iwasaki

Reenable RTC interrupts after wakeup. Some laptops have a problem
with system statistics monitoring tools (such as systat, vmstat...)
because of stopping RTC interrupts generation.
Restore all the timers (RTC and i8254) atomically.

Reviewed by: bde
MFC after: 1 week


# 82555 30-Aug-2001 msmith

Add ACPI attachments.


# 76650 15-May-2001 jhb

Remove unneeded includes of sys/ipl.h and machine/ipl.h.


# 76089 27-Apr-2001 jhb

Add in a missing call to forward_hardclock() in the SMP case.

Submitted by: bde


# 76078 27-Apr-2001 jhb

Overhaul of the SMP code. Several portions of the SMP kernel support have
been made machine independent and various other adjustments have been made
to support Alpha SMP.

- It splits the per-process portions of hardclock() and statclock() off
into hardclock_process() and statclock_process() respectively. hardclock()
and statclock() call the *_process() functions for the current process so
that UP systems will run as before. For SMP systems, it is simply necessary
to ensure that all other processors execute the *_process() functions when the
main clock functions are triggered on one CPU by an interrupt. For the alpha
4100, clock interrupts are delievered in a staggered broadcast fashion, so
we simply call hardclock/statclock on the boot CPU and call the *_process()
functions on the secondaries. For x86, we call statclock and hardclock as
usual and then call forward_hardclock/statclock in the MD code to send an IPI
to cause the AP's to execute forwared_hardclock/statclock which then call the
*_process() functions.
- forward_signal() and forward_roundrobin() have been reworked to be MI and to
involve less hackery. Now the cpu doing the forward sets any flags, etc. and
sends a very simple IPI_AST to the other cpu(s). AST IPIs now just basically
return so that they can execute ast() and don't bother with setting the
astpending or needresched flags themselves. This also removes the loop in
forward_signal() as sched_lock closes the race condition that the loop worked
around.
- need_resched(), resched_wanted() and clear_resched() have been changed to take
a process to act on rather than assuming curproc so that they can be used to
implement forward_roundrobin() as described above.
- Various other SMP variables have been moved to a MI subr_smp.c and a new
header sys/smp.h declares MI SMP variables and API's. The IPI API's from
machine/ipl.h have moved to machine/smp.h which is included by sys/smp.h.
- The globaldata_register() and globaldata_find() functions as well as the
SLIST of globaldata structures has become MI and moved into subr_smp.c.
Also, the globaldata list is only available if SMP support is compiled in.

Reviewed by: jake, peter
Looked over by: eivind


# 74914 28-Mar-2001 jhb

Catch up to header include changes:
- <sys/mutex.h> now requires <sys/systm.h>
- <sys/mutex.h> and <sys/sx.h> now require <sys/lock.h>


# 72678 19-Feb-2001 bde

Fixed style bugs in clock.c rev.1.164 and cpu.h rev.1.52-1.53 -- declare
tsc_present in the right places (together with other variables of the
same linkage), and don't use messy ifdefs just to avoid exporting it in
some cases.


# 72240 09-Feb-2001 jhb

Catch up to changes to inthand_add().


# 72200 09-Feb-2001 bmilekic

Change and clean the mutex lock interface.

mtx_enter(lock, type) becomes:

mtx_lock(lock) for sleep locks (MTX_DEF-initialized locks)
mtx_lock_spin(lock) for spin locks (MTX_SPIN-initialized)

similarily, for releasing a lock, we now have:

mtx_unlock(lock) for MTX_DEF and mtx_unlock_spin(lock) for MTX_SPIN.
We change the caller interface for the two different types of locks
because the semantics are entirely different for each case, and this
makes it explicitly clear and, at the same time, it rids us of the
extra `type' argument.

The enter->lock and exit->unlock change has been made with the idea
that we're "locking data" and not "entering locked code" in mind.

Further, remove all additional "flags" previously passed to the
lock acquire/release routines with the exception of two:

MTX_QUIET and MTX_NOSWITCH

The functionality of these flags is preserved and they can be passed
to the lock/unlock routines by calling the corresponding wrappers:

mtx_{lock, unlock}_flags(lock, flag(s)) and
mtx_{lock, unlock}_spin_flags(lock, flag(s)) for MTX_DEF and MTX_SPIN
locks, respectively.

Re-inline some lock acq/rel code; in the sleep lock case, we only
inline the _obtain_lock()s in order to ensure that the inlined code
fits into a cache line. In the spin lock case, we inline recursion and
actually only perform a function call if we need to spin. This change
has been made with the idea that we generally tend to avoid spin locks
and that also the spin locks that we do have and are heavily used
(i.e. sched_lock) do recurse, and therefore in an effort to reduce
function call overhead for some architectures (such as alpha), we
inline recursion for this case.

Create a new malloc type for the witness code and retire from using
the M_DEV type. The new type is called M_WITNESS and is only declared
if WITNESS is enabled.

Begin cleaning up some machdep/mutex.h code - specifically updated the
"optimized" inlined code in alpha/mutex.h and wrote MTX_LOCK_SPIN
and MTX_UNLOCK_SPIN asm macros for the i386/mutex.h as we presently
need those.

Finally, caught up to the interface changes in all sys code.

Contributors: jake, jhb, jasone (in no particular order)


# 71797 29-Jan-2001 peter

Convert mca (microchannel bus support) from something that we count
(bogus) to something that we test for the presence of.


# 71320 21-Jan-2001 jasone

Remove MUTEX_DECLARE() and MTX_COLD. Instead, postpone full mutex
initialization until after malloc() is safe to call, then iterate through
all mutexes and complete their initialization.

This change is necessary in order to avoid some circular bootstrapping
dependencies.


# 71262 19-Jan-2001 peter

Convert apm from a bogus 'count' into a plain option. Clean out some
other cruft from the files.alpha and files.ia64 that were related to this.


# 69521 02-Dec-2000 markm

Namespace cleanup. Remove some #includes in favour of an explicit
declaration.

Asked for by: bde


# 67759 28-Oct-2000 phk

Revert two experimental changes which escaped from my devel machine.


# 67708 27-Oct-2000 phk

Convert all users of fldoff() to offsetof(). fldoff() is bad
because it only takes a struct tag which makes it impossible to
use unions, typedefs etc.

Define __offsetof() in <machine/ansi.h>

Define offsetof() in terms of __offsetof() in <stddef.h> and <sys/types.h>

Remove myriad of local offsetof() definitions.

Remove includes of <stddef.h> in kernel code.

NB: Kernelcode should *never* include from /usr/include !

Make <sys/queue.h> include <machine/ansi.h> to avoid polluting the API.

Deprecate <struct.h> with a warning. The warning turns into an error on
01-12-2000 and the file gets removed entirely on 01-01-2001.

Paritials reviews by: various.
Significant brucifications by: bde


# 67551 25-Oct-2000 jhb

- Overhaul the software interrupt code to use interrupt threads for each
type of software interrupt. Roughly, what used to be a bit in spending
now maps to a swi thread. Each thread can have multiple handlers, just
like a hardware interrupt thread.
- Instead of using a bitmask of pending interrupts, we schedule the specific
software interrupt thread to run, so spending, NSWI, and the shandlers
array are no longer needed. We can now have an arbitrary number of
software interrupt threads. When you register a software interrupt
thread via sinthand_add(), you get back a struct intrhand that you pass
to sched_swi() when you wish to schedule your swi thread to run.
- Convert the name of 'struct intrec' to 'struct intrhand' as it is a bit
more intuitive. Also, prefix all the members of struct intrhand with
'ih_'.
- Make swi_net() a MI function since there is now no point in it being
MD.

Submitted by: cp


# 67356 20-Oct-2000 jhb

- machine/mutex.h -> sys/mutex.h
- machine/ipl.h -> sys/ipl.h
- Use MUTEX_DECLARE() for clock_lock


# 66716 06-Oct-2000 jhb

- Change fast interrupts on x86 to push a full interrupt frame and to
return through doreti to handle ast's. This is necessary for the
clock interrupts to work properly.
- Change the clock interrupts on the x86 to be fast instead of threaded.
This is needed because both hardclock() and statclock() need to run in
the context of the current process, not in a separate thread context.
- Kill the prevproc hack as it is no longer needed.
- We really need Giant when we call psignal(), but we don't want to block
during the clock interrupt. Instead, use two p_flag's in the proc struct
to mark the current process as having a pending SIGVTALRM or a SIGPROF
and let them be delivered during ast() when hardclock() has finished
running.
- Remove CLKF_BASEPRI, which was #ifdef'd out on the x86 anyways. It was
broken on the x86 if it was turned on since cpl is gone. It's only use
was to bogusly run softclock() directly during hardclock() rather than
scheduling an SWI.
- Remove the COM_LOCK simplelock and replace it with a clock_lock spin
mutex. Since the spin mutex already handles disabling/restoring
interrupts appropriately, this also lets us axe all the *_intr() fu.
- Back out the hacks in the APIC_IO x86 cpu_initclocks() code to use
temporary fast interrupts for the APIC trial.
- Add two new process flags P_ALRMPEND and P_PROFPEND to mark the pending
signals in hardclock() that are to be delivered in ast().

Submitted by: jakeb (making statclock safe in a fast interrupt)
Submitted by: cp (concept of delaying signals until ast())


# 66698 05-Oct-2000 jhb

- Heavyweight interrupt threads on the alpha for device I/O interrupts.
- Make softinterrupts (SWI's) almost completely MI, and divorce them
completely from the x86 hardware interrupt code.
- The ihandlers array is now gone. Instead, there is a MI shandlers array
that just contains SWI handlers.
- Most of the former machine/ipl.h files have moved to a new sys/ipl.h.
- Stub out all the spl*() functions on all architectures.

Submitted by: dfr


# 65822 13-Sep-2000 jhb

- Remove the inthand2_t type and use the equivalent driver_intr_t type from
newbus for referencing device interrupt handlers.
- Move the 'struct intrec' type which describes interrupt sources into
sys/interrupt.h instead of making it just be a x86 structure.
- Don't create 'ithd' and 'intrec' typedefs, instead, just use 'struct ithd'
and 'struct intrec'
- Move the code to translate new-bus interrupt flags into an interrupt thread
priority out of the x86 nexus code and into a MI ithread_priority()
function in sys/kern/kern_intr.c.
- Remove now-uneeded x86-specific headers from sys/dev/ata/ata-all.c and
sys/pci/pci_compat.c.


# 65557 06-Sep-2000 jasone

Major update to the way synchronization is done in the kernel. Highlights
include:

* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The
alpha port is still in transition and currently uses both.)

* Per-CPU idle processes.

* Interrupts are run in their own separate kernel threads and can be
preempted (i386 only).

Partially contributed by: BSDi (BSD/OS)
Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh


# 64031 30-Jul-2000 phk

Allow use of TSC even if APM is compiled in but disabled.


# 62573 04-Jul-2000 phk

Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.

Pointed out by: bde


# 62454 03-Jul-2000 phk

Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:

Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our
sources:

-sysctl_vm_zone SYSCTL_HANDLER_ARGS
+sysctl_vm_zone (SYSCTL_HANDLER_ARGS)


# 61994 23-Jun-2000 msmith

Add PnP probe methods to some common AT hardware drivers. In each case,
the PnP probe is merely a stub as we make assumptions about some of this
hardware before we have probed it.

Since these devices (with the exception of the speaker) are 'standard',
suppress output in the !bootverbose case to clean up the probe messages
somewhat.


# 61126 31-May-2000 bde

Add SWI_TQ_MASK to all interrupt masks except SWI_CLOCK_MASK. Use a
new macro SWI_LOW_MASK to give the mask for low priority SWIs instead
of hard-coding this mask as SWI_CLOCK_MASK.

Reviewed by: dfr


# 58377 20-Mar-2000 phk

Isolate the Timecounter internals in their own two files.

Make the public interface more systematically named.

Remove the alternate method, it doesn't do any good, only ruins performance.

Add counters to profile the usage of the 8 access functions.

Apply the beer-ware to my code.

The weird +/- counts are caused by two repocopies behind the scenes:
kern/kern_clock.c -> kern/kern_tc.c
sys/time.h -> sys/timetc.h
(thanks peter!)


# 55420 04-Jan-2000 tegge

ISA device drivers use the ISA source interrupt number in locations where
the low level interrupt handler number should be used. Change
setup_apic_irq_mapping() to allocate low level interrupt handler X (Xintr${X})
for any ISA interrupt X mentioned in the MP table.

Remove an assumption in the driver for the system clock (clock.c) that
interrupts mentioned in the MP table as delivered to IOAPIC #0 intpin Y
is handled by low level interrupt handler Y (Xintr${Y}) but don't assume
that low level interrupt handler 0 (Xintr0) is used.

Don't allocate two low level interrupt handlers for the system clock.
Reviewed by: NOKUBI Hirotaka <hnokubi@yyy.or.jp>


# 55098 25-Dec-1999 bde

Fixed races accessing the RTC. The races apparently caused
apm_default_resume() to sometimes set a very wrong time.
(1) Accesses to the RTC index and data registers were not atomic enough.
Interrupts were not masked. This was only good enough until an
interrupt handler (rtcintr()) started accessing the RTC in FreeBSD-2.0.
(2) Access to the block of time registers in inittodr() was not atomic
enough. inittodr() has 244us to read the time registers. Interrupts
were not masked. This was only good enough until something (apm)
started calling inittodr() after boot time in FreeBSD-2.0.
The fix for (2) also makes the timecounter update more atomic, although
this is currently unimportant due to the low resolution of the RTC.

Problem reported by: mckay


# 54890 20-Dec-1999 peter

Remove references to register_intr() etc in comments.


# 52669 30-Oct-1999 iwasaki

i8254_restore is called from apm_default_resume() to reload
the countdown register.
this should not be necessary but there are broken laptops that
do not restore the countdown register on resume.
when it happnes, it messes up the hardclock interval and system clock,
which leads to the infamous "calcru: negative time" problem.

Submitted by: kjc, iwasaki
Reviewed by: Steve O'Hara-Smith <steveo@eircom.net> and committers.
Obtained from: PAO3


# 50823 03-Sep-1999 mdodd

This adds the i386 specific support for systems with a MicroChannel
Architecture bus.

Reviewed by: msmith


# 50477 27-Aug-1999 peter

$Id$ -> $FreeBSD$


# 49558 09-Aug-1999 phk

Merge the cons.c and cons.h to the best of my ability. alpha may or
may not compile, I can't test it.


# 49196 28-Jul-1999 green

Remove XXX from the headers (broke the build, I'm betting.)


# 49186 28-Jul-1999 msmith

We're called too early to have any idea whether APM is going to be
active or not. The only sane thing we can do here is assume that if
APM is supported it might be active at some point, and bail.

In reality, even this isn't good enough; regardless of whether we support
APM or not, the system may well futz with the CPU's clock speed and throw
the TSC off. We need to stop using it for timekeeping except under
controlled circumstances. Curse the lack of a dependable high-resolution
timer.


# 48889 18-Jul-1999 bde

Updated acquire_timer2()'s state machine to work when the i8254 is
being used for timecounting. Fixed a race or two in it. Undisabled
it.

PR: 10455


# 48888 18-Jul-1999 bde

Don't let the machdep.tsc_freq sysctl proceed if the TSC is present
but broken, since tsc_timecounter is not initialised in that case,
and updating an uninitialised timecounter is fatal.

Fixed style bugs in the machdep.i8254_freq and machdep.tsc_freq
sysctls.

Reviewed by: phk


# 48266 27-Jun-1999 peter

Shut up gcc.


# 48160 24-Jun-1999 green

This commit gives support for the Rise mP6 CPU. It has two changes:
1. Rise is recognized in identdcpu.c.
2. The TSC is not written to. A workaround for the CPU bug is being
applied to clock.c (the bug being that the mP6 has TSC enabled
in its CPUID-capabilities, but it only supports reading it. If we
try to write to it (MSR 16), a GPF occurs.) The new behavior is that
FreeBSD will _not_ zero the TSC. Instead, we do a bit of 64-bit
arithmetic.

Reviewed by: msmith
Obtained from: unfurl & msmith


# 47642 31-May-1999 dfr

Remove fd driver from its old home and change files which include rtc.h
to account for its new location.


# 47592 29-May-1999 phk

Stop the TSC from being used as timecounter on K5/step0 machines.


# 47588 28-May-1999 bde

Fixed glitches (jumps) of about 1/HZ seconds for the i8254 timecounter.
The old version only worked right when the time was read strictly
more often than every 1/HZ seconds, but we only guarantee reading
it every (1/HZ + epsilon) seconds. Part of rev.1.126-1.127 attempted
to fix this but didn't succeed. Detect counter rollover using the
heuristic from the old version of microtime() with additional
complications for supporting calls from fast interrupt handlers.
This works provided i8254 interrupts are not delayed by more than
1/(2*HZ) seconds.

This needs more comments, and cleanups for the SMP case, and more
testing of the SMP case before it is merged into RELENG_3.

Tested by: jhay


# 46847 09-May-1999 peter

For what it's worth, idelayed is declared as a volatile in the headers,
and even though it's not used in this file make it a volatile here too.


# 46054 25-Apr-1999 phk

Make the machdep.i8254_freq and machdep.tsc_freq sysctls modify the
timecounter as well

Asked for by: bde, jhay


# 45900 21-Apr-1999 peter

oops, SMP was missing includes for a typedef.


# 45897 21-Apr-1999 peter

Stage 1 of a cleanup of the i386 interrupt registration mechanism.
Interrupts under the new scheme are managed by the i386 nexus with the
awareness of the resource manager. There is further room for optimizing
the interfaces still. All the users of register_intr()/intr_create()
should be gone, with the exception of pcic and i386/isa/clock.c.


# 41787 14-Dec-1998 mckay

Fix tabs that should have been spaces. Some were in kernel error messages.


# 40610 23-Oct-1998 phk

Update timecounters to new interface.


# 39526 20-Sep-1998 bde

Attempt to work around a bug in the previous commit related to
non-reentrancy of SMP clock locking. Depend on the giant lock
protecting clkintr().


# 39503 20-Sep-1998 bde

Ensure that the i8254 timecounter doesn't go backards. It sometimes
went backwards when interrupts were masked for more than one i8254
interrupt period. It sometimes went backwards when the i8254 counter
was reprogrammed. Neither of these should happen in normal operation.

Update the i8254 timecounter support variables atomically. Calling
timecounter functions from fast interrupt handlers may actually work
in all cases now.


# 38888 06-Sep-1998 tegge

Maintain a mapping from irq number to (ioapic number, int pin) tuple,
and use this when masking/unmasking interrupts.

Maintain a mapping from (iopaic number, int pin) tuple to irq number,
and use this when configuring devices and programming the ioapics.

Previous code assumed that irq number was equal to int pin number, and
that the ioapic number was 0.

Don't let an AP enter _cpu_switch before all local apics are initialized.


# 36810 09-Jun-1998 phk

Add a tc_ prefix to struct timecounter members.

Urged by: bde


# 36741 07-Jun-1998 phk

Add a member function more to the timecounters, this one is for use
with latch based PPS implementations. The client that uses it will
be committed after more testing.


# 36719 07-Jun-1998 phk

Add a "this" style argument and a "void *private" so timecounters can
figure out which instance to wount with.


# 36441 28-May-1998 phk

Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.

Clean up (or if antipodic: down) some of the msgbuf stuff.

Use an inline function rather than a macro for timecounter delta.

Maintain process "on-cpu" time as 64 bits of microseconds to avoid
needless second rollover overhead.

Avoid calling microuptime the second time in mi_switch() if we do
not pass through _idle in cpu_switch()

This should reduce our context-switch overhead a bit, in particular
on pre-P5 and SMP systems.

WARNING: Programs which muck about with struct proc in userland
will have to be fixed.

Reviewed, but found imperfect by: bde


# 36198 19-May-1998 phk

Change a data type internal to the timecounters, and remove the "delta"
function.

Reviewed, but not entirely approved by: bde


# 35035 04-Apr-1998 tegge

Remove some unneeded statements that enabled interrupts.


# 34961 30-Mar-1998 phk

Eradicate the variable "time" from the kernel, using various measures.
"time" wasn't a atomic variable, so splfoo() protection were needed
around any access to it, unless you just wanted the seconds part.

Most uses of time.tv_sec now uses the new variable time_second instead.

gettime() changed to getmicrotime(0.

Remove a couple of unneeded splfoo() protections, the new getmicrotime()
is atomic, (until Bruce sets a breakpoint in it).

A couple of places needed random data, so use read_random() instead
of mucking about with time which isn't random.

Add a new nfs_curusec() function.

Mark a couple of bogosities involving the now disappeard time variable.

Update ffs_update() to avoid the weird "== &time" checks, by fixing the
one remaining call that passwd &time as args.

Change profiling in ncr.c to use ticks instead of time. Resolution is
the same.

Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call
hzto() which subtracts time" sequences.

Reviewed by: bde


# 34617 16-Mar-1998 phk

Be less draconian about the TSC if APM is configured, use it for
timecounting if APM-BIOS isn't found.
Be just as draconian about SMP as always, but explain it better.


# 34571 14-Mar-1998 tegge

On SMP systems, initially follow the MP spec with regard to which pin
on the IOAPIC being connected to the 8254 timer interrupt.
Verify that timer interrupts are delivered. If they aren't, attempt
a fallback to mixed mode (i.e. routing the timer interrupt via the 8259 PIC).


# 34058 05-Mar-1998 tegge

Remove special handling for resuming clock interrupt when using APIC_IO.
The `generic' vector stubs do the right thing.


# 33929 28-Feb-1998 phk

Prevent the TSC from being used on APM machines, we have no idea if
it runs at a constant frequency. This was less of an issue before,
because the TSC only interpolated in the HZ intervals, but now where
the timecounter is used all the way, this becomes much more visible.

Nit: Fix a printf which triggered the bde-filter.


# 33753 22-Feb-1998 bde

Quick fix for the i8254 timecounter often gaining 10 msec.


# 33727 21-Feb-1998 jkh

Add missing CLOCK_UNLOCK() before write_eflags().
Submitted by: dave adkins <adkin003@tc.umn.edu>


# 33690 20-Feb-1998 phk

Replace TOD clock code with more systematic approach.

Highlights:
* Simple model for underlying hardware.
* Hardware basis for timekeeping can be changed on the fly.
* Only one hardware clock responsible for TOD keeping.
* Provides a real nanotime() function.
* Time granularity: .232E-18 seconds.
* Frequency granularity: .238E-12 s/s
* Frequency adjustment is continuous in time.
* Less overhead for frequency adjustment.
* Improves xntpd performance.

Reviewed by: bde, bde, bde


# 33309 13-Feb-1998 bde

Update timer0_prescaler_count before calling hardclock() while timer0
is "acquired". This fixes a TSC biasing error of about 10 msec when
pcaudio is active.

Update `time' before calling hardclock() when timer0 is being released.
This is not known to be important.

Added some delays in writertc(). Efficiency is not critical here, unlike
in rtcin(), and we already use conservative delays there.

Don't touch the hardware when machdep.i8254_freq is being changed but
the maximum count wouldn't change. This fixes jitter of up to 10 msec
for most small adjustments to machdep.i8254_freq. When the maximum
count needs to change, the hardware should be adjusted more carefully.


# 33181 09-Feb-1998 eivind

Staticize.


# 32850 28-Jan-1998 phk

APM calls inittodr(0) which is stupid, but at least stop setting the
clock back to when Dennis had a good idea.


# 32054 28-Dec-1997 phk

More cleanup relating to our use of the TSC.
Look in the cpu_feature (CPUID output) to see if we have it.


# 32052 28-Dec-1997 phk

wash, sort and put in order various nits from the i586_ctr -> tsc
commit.

Pointed out by: bde


# 32005 26-Dec-1997 phk

Rename "i586_ctr" to "tsc" (both upper and lower case instances).
Fix a couple of printfs too.

Warning: This changes the names of a couple of kernel options!


# 31253 18-Nov-1997 bde

Removed #unused includes.

Added a used #include (don't depend on yet to be fixed namespace pollution).


# 30805 28-Oct-1997 bde

Don't include <machine/cputypes.h> or declare cputype/class interfaces
in <machine/cpu.h>. Moved the declarations to <machine/cputypes.h>.
Fixed style bugs in the moved code. Fixed everything that depended on
the nested include. Don't include <machine/cpu.h> (in the changed files)
unless something in it is used directly.


# 29000 01-Sep-1997 fsmp

General cleanup of the sub-system locking macros.
Eliminated the RECURSIVE_MPINTRLOCK.
clock.c and microtime use clock_lock.
sio.c and cy.c use com_lock.

Suggestions by: Bruce Evans <bde@zeta.org.au>


# 28921 30-Aug-1997 fsmp

Another round of lock pushdown.
Add a simplelock to deal with disable_intr()/enable_intr() as used in UP kernel.
UP kernel expects that this is enough to guarantee exclusive access to
regions of code bracketed by these 2 functions.
Add a simplelock to bracket clock accesses in clock.c: clock_lock.

Help from: Bruce Evans <bde@zeta.org.au>


# 28551 21-Aug-1997 bde

#include <machine/limits.h> explicitly in the few places that it is required.


# 28487 21-Aug-1997 fsmp

Made PEND_INTS default.
Made NEW_STRATEGY default.
Removed misc. old cruft.

Centralized simple locks into mp_machdep.c
Centralized simple lock macros into param.h

More cleanup in the direction of making splxx()/cpl MP-safe.


# 27696 25-Jul-1997 fsmp

clock.c:
- removed TEST_ALTTIMER.
- removed APIC_PIN0_TIMER.
- removed TIMER_ALL.

apic_vector.s:
- new algorithm where a CPU uses try_mplock instead of get_mplock:
if successful continue as before.
if fail set ipending bit, mask INT (to avoid recursion), cleanup & iret.

This allows the CPU to return to successful work, while the ISR will be run
by the CPU holding the lock as part of the doreti dance.


# 27616 22-Jul-1997 fsmp

Last commit didn't take, operator error???


# 27612 22-Jul-1997 fsmp

Major cleanup of APIC code around the imen variable.
This is the first step towards making INTREN()/INTRDIS() MP-safe.


# 27563 20-Jul-1997 fsmp

Developed a new strategy for handling the 8254/8259/APIC issue.


# 27560 20-Jul-1997 fsmp

Minor cleanup.


# 27555 20-Jul-1997 bde

Removed unused #includes.


# 27522 19-Jul-1997 fsmp

Added #code to support define APIC_PIN0_TIMER.
This code ALWAYS runs the 8254 timer thru the 8259 ICU.
It depricates the usage of "options SMP_TIMER_NC" in the config file.


# 27520 19-Jul-1997 fsmp

SMP or APIC_IO:
- Increased NIDT to 256.
- Moved IPI vectors up above the linux compat vector.
- Removed runtime setup of RTC vector.


# 27490 18-Jul-1997 fsmp

Made the printing of the APIC INTs depend on bootverbose.


# 27352 12-Jul-1997 fsmp

Cleanup old stop_cpus/restart_cpus() cruft.
new code for handling mixed-mode 8259/APIC programming without 'ExtInt'
new code to control other CPUs: stop_cpus()/restart_cpus()/_Xstopcpu


# 26949 25-Jun-1997 fsmp

Modified to use merged/renamed functions:

- get_isa_apic_mask() -> isa_apic_mask()
- get_isa_apic_irq() && get_eisa_apic_irq() -> isa_apic_pin()


# 26373 02-Jun-1997 dfr

Move interrupt handling code from isa.c to a new file. This should make
isa.c (slightly) more portable and will make my life developing the really
portable version much easier.

Reviewed by: peter, fsmp


# 26309 31-May-1997 peter

Include file updates.. <machine/spl.h> -> <machine/ipl.h>, add
<machine/ipl.h> to those files that were depending on getting SWI_*
implicitly via <machine/cpufunc.h>


# 26264 29-May-1997 peter

No longer need opt_smp.h here


# 26129 25-May-1997 fsmp

Made the array vec[] a global.
This allows the APIC code to reorder the vectors at runtime.


# 25485 05-May-1997 peter

correct the order of the variables
use #ifdef where possible instead of #if defined

Submitted by: the KNF police, ie: bde :-)


# 25457 04-May-1997 peter

Don't remove i586_ctr_freq from scope, leave it defined as zero. This
simplifies some assumptions and stops some code compile problems.

This should fix the compile hiccup in PR#3491, but smp kernel profiling
isn't likely to be fixed by this.


# 25164 26-Apr-1997 peter

Man the liferafts! Here comes the long awaited SMP -> -current merge!

There are various options documented in i386/conf/LINT, there is more to
come over the next few days.

The kernel should run pretty much "as before" without the options to
activate SMP mode.

There are a handful of known "loose ends" that need to be fixed, but
have been put off since the SMP kernel is in a moderately good condition
at the moment.

This commit is the result of the tinkering and testing over the last 14
months by many people. A special thanks to Steve Passe for implementing
the APIC code!


# 24676 06-Apr-1997 mckay

Prevent wedging of the stat clock because of missed interrupts.
This should cure the "alternate system clock has died!" problem.

Discussed with: bde, joerg


# 23393 05-Mar-1997 bde

Only print clock calibration messages if the system was booted with -v.

Submitted by: partly by gpalmer


# 23386 04-Mar-1997 gpalmer

Back out the patch to break up the clock probe lines. Instead, follow
Bruce's suggestion of deleting "relative to mc146818A clock ",
thus shortening the line ...


# 23375 04-Mar-1997 gpalmer

Split the rather long and line-wrapping clock probe messages on boot.
(2.2?)

Submitted by: Mathew Dood <winter@jurai.net>


# 22975 22-Feb-1997 peter

Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not
ready for it yet.


# 22106 29-Jan-1997 bde

Estimate an initial overhead of 0 usec instead of 20 usec in DELAY().
I have code to calibrate the overhead fairly accurately, but there
is little point in using it since it is most accurate on machines
where an estimate of 0 works well. On slow machines, the accuracy
of DELAY() has a large variance since it is limited by the resolution
of getit() even if the initial delay is calibrated perfectly.

Use fixed point and long longs to speed up scaling in DELAY().
The old method slowed down a lot when the frequency became variable.
Assume the default frequency for short delays so that the fixed
point calculation can be exact.

Fast scaling is only important for small delays. Scaling is done
after looking at the counter and outside the loop, so it doesn't
decrease accuracy or resolution provided it completes before the
delay is up. The comment in the code is still confused about this.


# 21783 16-Jan-1997 bde

Guard against the i8254 timer being uninitialzed if DELAY() is
called early for console i/o. The timer is usually in BIOS mode
if it isn't explicitly initialized. Then it counts twice as fast
and has a max count of 65535 instead of 11932. The larger count
tended to cause infinite loops for delays of > 20 us. Such delays
are rare. For syscons and kbdio, DELAY() is only called early
enough to matter for ddb input after booting with -d, and the delay
is too small to matter (and too small to be correct) except in the
PC98 case. For pcvt, DELAY() is not used for small delays (pcvt
uses its own broken routine instead of the standard broken one),
but some versions call DELAY() with a large arg when they unnecessarily
initialize the keyboard for doing console output. The problem is
more serious for pcvt because there is always some early console
output.

Guard against the i8254 timer being partially or incorrectly
initialized. This would have prevented the endless loop.

Should be in 2.2.


# 21673 14-Jan-1997 jkh

Make the long-awaited change from $Id$ to $FreeBSD$

This will make a number of things easier in the future, as well as (finally!)
avoiding the Id-smashing problem which has plagued developers for so long.

Boy, I'm glad we're not using sup anymore. This update would have been
insane otherwise.


# 19186 25-Oct-1996 bde

Removed initialization of a variable that went away. Oops.


# 19173 25-Oct-1996 bde

Print the clock calibration messages all on one (long) line again so
that they are easy to grep for.

Removed now-unused i586 counter variables.

Fixed some style bugs.


# 18842 09-Oct-1996 bde

Put I*86_CPU defines in opt_cpu.h.


# 18297 14-Sep-1996 bde

Attached simple external ddb commands `show rtc', `show pgrpdump'
and `show cbstat'. The pgrpdump code was previously controlled by
`#ifdef DEBUG'.


# 18288 14-Sep-1996 bde

Changed cncheckc() interface so that it is 8-bit clean - return -1
instead of 0 if there is no input.

syscons.c:
Added missing spl locking in sccncheckc(). Return the same value as
sccngetc() would. It is wrong for sccngetc() to return non-ASCII, but
stripping the non-ASCII bits doesn't help.


# 17395 02-Aug-1996 bde

Eliminated i586_ctr_rate. Use i586_ctr_freq instead.


# 17353 30-Jul-1996 bde

Fixed the machdep.i8254_freq and machdep.i586_freq sysctls. Writes were
handled bogusly.

Centralized the setting of all the frequency variables. Set these
variables atomically. Some new ones aren't used yet.


# 17236 21-Jul-1996 joerg

Post-commit review by Bruce. Mostly stylistic changes.

Submitted by: bde


# 17231 20-Jul-1996 joerg

Major cleanup of the timerX_{acquire,release} stuff. In particular,
make it more intelligible, improve the partially bogus locking, and
allow for a ``quick re-acquiration'' from a pending release of timer 0
that happened ``recently'', so it was not processed yet by clkintr().
This latter modification now finally allows to play XBoing over
pcaudio without losing sounds or getting complaints. ;-) (XBoing
opens/writes/closes the sound device all over the day.)

Correct locking for sysbeep().

Extensively (:-) reviewed by: bde


# 17194 17-Jul-1996 bde

Fixed adjustment of `time' when timer0 is released. 27465 was 27645 in
a comment and in code that was only used when pcaudio was closed. The
maximum error was 66 usec.


# 16874 01-Jul-1996 bde

Use the standard timer (interrupt) frequency while calibrating the clocks.
Testing with the high frequency of 20000 Hz (to find problems) only found
the problem that this frequency is too high for slow i386's.

Disable interrupts while setting the timer frequency. This was unnecessary
before rev.1.57 and forgotten in rev.1.57. The critical (i8254) interrupts
are disabled in another way at boot time but not in the sysctl to change
the frequency.


# 16428 17-Jun-1996 bde

In getit(), use read_eflags()/write_eflags() to preserve the interrupt
enable flag instead of enable_intr() to restore it to its usual state.
getit() is only called from DELAY() so there is no point in optimising
its speed (this wasn't so clear when it was extern), and using
enable_intr() made it inconvenient to call DELAY() from probes that need
to run with interrupts disabled.


# 16300 11-Jun-1996 pst

Move warning messages under bootverbose


# 16299 11-Jun-1996 pst

Put clock calibration #defines in opt_clock.h to ease reconfiguration


# 15508 01-May-1996 bde

Added calibration the i8254 and the i586 clocks agains the RTC at boot
time. The results are currently ignored unless certain temporary options
are used.

Added sysctls to support reading and writing the clock frequency variables
(not the frequencies themselves). Writing is supposed to atomically
adjust all related variables.

machdep.c:
Fixed spelling of a function name in a comment so that I can log this
message which should have been with the previous commit.

Initialize `cpu_class' earlier so that it can be used in startrtclock()
instead of in calibrate_cyclecounter() (which no longer exists).

Removed range checking of `cpu'. It is always initialized to CPU_XXX
so it is less likely to be out of bounds than most variables.

clock.h:
Removed I586_CYCLECTR(). Use rdtsc() instead.

clock.c:
TIMER_FREQ is now a variable timer_freq that defaults to the old value of
TIMER_FREQ. #define'ing TIMER_FREQ should still work and may be the best
way of setting the frequency.

Calibration involves counting cycles while watching the RTC for one second.
This gives values correct to within (a few ppm) + (the innaccuracy of the
RTC) on my systems.


# 15345 22-Apr-1996 nate

- add apm to the GENERIC kernel (disabled by default), and add some comments
regarding apm to LINT
- Disabled the statistics clock on machines which have an APM BIOS and
have the options "APM_BROKEN_STATCLOCK" enabled (which is default
in GENERIC now)
- move around some of the code in clock.c dealing with the rtc to make
it more obvios the effects of disabling the statistics clock

Reviewed by: bde


# 15054 05-Apr-1996 ache

Fix adjkerntz expression priority


# 15045 05-Apr-1996 ache

Add wall_cmos_clock sysctl variable, needed to manage adjkerntz even for
UTC cmos clocks (needed for Local Timezone FSes)


# 14943 31-Mar-1996 bde

Moved rtcin() to clock.c.

Always delay using one inb(0x84) after each i/o in rtcin() - don't
do this conditional on the bogus option DUMMY_NOPS not being defined.
If you want an optionally slightly faster rtcin() again, then inline
it and use a better named option or sysctl variable. It only needs
to be fast in rtcintr().


# 14773 23-Mar-1996 nate

Whoops, back out the last commit, which was accidentally committed at
the same time as the if_zp cleanup patch.

The commit that occurred was an incomplete patch for APM on my laptop
and needs more work.


# 14772 23-Mar-1996 nate

Now that ac->ac_ipaddr and arpwhohas() no longer exist, remove the
ifdef'd out code that used it.


# 13758 30-Jan-1996 wollman

No longer use the cyclecounter to attempt to correct for late or missed
clock interrupts.

Keep a 1-in-16 smoothed average of the length of each tick. If the
CPU speed is correctly diagnosed, this should give experienced users
enough information to figure out a more suitable value for `tick'.


# 13453 16-Jan-1996 ache

Since new bcd* macros not argument range overflow resistant,
fix argument overflow for years >= 2000


# 13445 15-Jan-1996 phk

My wife is busy making me a new conical hat, so you don't need to
send any to me this time. Commited an old copy of this files where
the tables were swapped. Duh!.


# 13444 15-Jan-1996 phk

Soren called an said that I screwed up badly, so I backup until
I find out how... Sorry.


# 13438 15-Jan-1996 phk

Make bin2bcd and bcd2bin global macroes instead of having local
implementations all over the place.


# 13402 12-Jan-1996 bde

Fixed handling of Feb 29 in resettodr().


# 13350 08-Jan-1996 ache

Replace ugly year/month calculations in resettodr to more clean
variants, idea taken from NetBSD clock.c.
At least year calculation was wrong, pointed by Bruce.
Use different strategy to store year for BIOS without RTC_CENTURY


# 13228 04-Jan-1996 wollman

Convert DDB to new-style option.


# 13000 24-Dec-1995 dg

Add Pentium Pro CPU detection and special handling. For now, all the
optimizations we have for 586s also apply to 686s...this will be fine-
tuned in the future as appropriate.


# 12941 20-Dec-1995 wollman

Increase Pentium cyclecounter calibration time to 131072 us. This
experimentally seems to give better results on my machine.


# 12844 14-Dec-1995 bde

Fixed staticization of DDB functions.


# 12724 10-Dec-1995 phk

Staticize and cleanup.


# 12533 29-Nov-1995 wollman

Fix Pentium CPU rate diagnosis:
- Don't print out meaningless iCOMP numbers, those are for droids.
- Use a shorter wait to determine clock rate to avoid deficiencies
in DELAY().
- Use a fixed-point representation with 8 bits of fraction to store
the rate and rationalize the variable name. It would be
possible to use even more fraction if it turns out to be
worthwhile (I rather doubt it).

The question of source code arrangement remains unaddressed.


# 11872 28-Oct-1995 phk

Remove unused functions and variables, make things static, and other cleanups.


# 11452 12-Oct-1995 wollman

Reduce jitter of Pentium microtime() implementation by letting the counter
free-run and doing a subtract in microtime() rather than resetting the
counter to zero at every clock tick. In combination with the changes to
kern_clock.c, this should eliminate all the immediately obvious sources
of systematic jitter in timekeeping on Pentium machines.


# 10268 25-Aug-1995 bde

Remove extra args from the calls to getit(). The bug was benign with the
default function call convention.


# 9202 11-Jun-1995 rgrimes

Merge RELENG_2_0_5 into HEAD


# 8876 30-May-1995 rgrimes

Remove trailing whitespace.


# 8448 11-May-1995 bde

Add variable `idelayed' and macros setdelayed() and schedsofttty()
to access it. setdelayed() actually ORs the bits in `idelayed' into
`ipending' and clears `idelayed'.

Call setdelayed() every (normal) clock tick to convert delayed
interrupts into pending ones.

Drivers can set bits in `idelayed' at any time to schedule an interrupt
at the next clock tick. This is more efficient than calling timeout().
Currently only software interrupts can be scheduled.


# 7090 16-Mar-1995 bde

Add and move declarations to fix all of the warnings from `gcc -Wimplicit'
(except in netccitt, netiso and netns) and most of the warnings from
`gcc -Wnested-externs'. Fix all the bugs found. There were no serious
ones.


# 5722 19-Jan-1995 ats

Submitted by: Bruce Evans
Put in the much shorter and cleaner version for the calibrate_cycle_counter
for the Pentium that Bruce suggested. Tested here on my Pentium and
it works okay.


# 5431 07-Jan-1995 ats

Work around a compiler bug in gcc2.6.3 in handling (long long) variables and
shifting. Also correct the original code as Garrett noticed it in mail.
Leave the mishandled code in to use it later if future versions of gcc
are correct. The code was part of the calibrate_cyclecounter routine to
get the speed of the pentium chip.


# 5291 30-Dec-1994 bde

icu.s:
Move definition of `stat_imask' to clock.c.

clock.c:
Rename `rtcmask' to `stat_imask' and export it. Rename `clkmask' to
`clk_imask' for consistency.

Only calculate TIMER_DIV(hz) once.

Merge debugging and "garbage" code to produce debugging code and format the
output better.

Make writertc() static inline and use it everywhere. Now all accesses to
the clock registers go through rtcin() and writertc().

Move rtc initialization to cpu_initclocks().

Merge enablertclock() with cpu_initclocks() and remove enablertclock().
The extra entry point was just a leftover from 1.1.5.


# 4396 12-Nov-1994 ache

Revision 1.6 fix was lost: don't write 0 to RTC_DIAG


# 4341 10-Nov-1994 ache

Use adjkerntz into inittodr too (for APM stuff)


# 4180 05-Nov-1994 bde

Maintain a new variable `timer0_overflow_threshold' so that microtime()
doesn't have to calculate it every call.

Rename `timer0_prescale' to `timer0_prescaler_count' and maintain it
correctly. Previously we lost a few 8253 cycles for every "prescaled"
clock interrupt, and the lossage grows rapidly at 16 KHz. Now we
only lose a few cycles for every standard clock interrupt.

Rename `*_divisor' to `*_max_count'.

Do the calculation of TIMER_DIV(rate) only once instead of 3 times each
time the rate is changed.

Don't allow preposterously large interrupt rates. Bug fixes elsewhere
should allow the system to survive rates that saturate the system, however.

Clean up declarations.

Include <machine/clock.h> to check our own declarations.


# 3867 25-Oct-1994 se

BEWARE: Interface change of register_intr() !

Changed the fifth parameter to register_intr() from u_int mask into
u_int *maskptr in preparation for new features (shared interrupts and
removable devices, eg. for PCMCIA).


# 3366 04-Oct-1994 ache

Add disable_rtc_set variable to block resettodr() call, needed for
adjkerntz -i, per Bruce suggestion


# 3355 04-Oct-1994 ache

RTC_CENTURY usage ifdefed out by USE_RTC_CENTURY compile option,
pointed by Bruce


# 3185 29-Sep-1994 sos

Updated pcaudio.c to latest from 1.1.5.1
Enabled timer reprogramming in clock.c (this could use more work).

Obtained from: FreeBSD-1.1.5.1


# 2932 20-Sep-1994 bde

Don't lose the RTC interrupt in resettodr().


# 2913 19-Sep-1994 ache

resettodr() implemented, inittodr() fixed
Submitted by: me & chris@gnome.co.uk


# 2873 18-Sep-1994 bde

Remove some unnecessary #includes.

Restore the simple leap year calculation as a macro and document it so
that it doesn't become complicated again. The simple version works
for all leap years covered by 32-bit time_t's. The complicated version
doesn't work for all leap years covered by 64-bit time_t's since among
other reasons, the solar system is not stable for long enough.

Fix declarations.

Nuke spinwait().


# 2858 18-Sep-1994 wollman

Redo Kernel NTP PLL support, kernel side.

This code is mostly taken from the 1.1 port (which was in turn taken from
Dave Mills's kern.tar.Z example). A few significant differences:

1) ntp_gettime() is now a MIB variable rather than a system call. A few
fiddles are done in libc to make it behave the same.

2) mono_time does not participate in the PLL adjustments.

3) A new interface has been defined (in <machine/clock.h>) for doing
possibly machine-dependent things around the time of the clock update.
This is used in Pentium kernels to disable interrupts, set `time', and
reset the CPU cycle counter as quickly as possible to avoid jitter in
microtime(). Measurements show an apparent resolution of a bit more than
8.14usec, which is reasonable given system-call overhead.


# 2770 14-Sep-1994 ache

1. adjkerntz variable added for preparation to resettodr() implementation
2. Leap year calculations fixed


# 2112 18-Aug-1994 wollman

Fix up some sloppy coding practices:

- Delete redundant declarations.
- Add -Wredundant-declarations to Makefile.i386 so they don't come back.
- Delete sloppy COMMON-style declarations of uninitialized data in
header files.
- Add a few prototypes.
- Clean up warnings resulting from the above.

NB: ioconf.c will still generate a redundant-declaration warning, which
is unavoidable unless somebody volunteers to make `config' smarter.


# 2103 18-Aug-1994 dg

Bruce Evans' dynamic interrupt support.

/usr/src/sys/i386/isa/clock.c:
o Garrett's statclock changes.
o Wire xxxintr, not Vclk.
o Wire using register_intr(), not setidt().

/usr/src/sys/i386/isa/icu.s:
o Garrett's statclock changes.
o Removed unused variable high_imask.
o Fake int 8 for rtc as well as int 0 for clk. Required for kernel
profiling with statclock, harmless otherwise.

/usr/src/sys/i386/isa/isa.c:
o Allow isdp->id_irq and other things in *isdp to be changed by
probes. Changing interrupts later requires direct calls to
register_intr() and unregister_intr() and more care.
ALLOW_CONFLICT_* is brought over from 1.1.5, except
ALLOW_CONFLICT_IRQ is not supported. IRQ conflict checking is
delayed until after probing so that drivers can change the IRQ
to a free one; real conflicts require more cooperation between
drivers to handle.
o Too many details to list.
o This file requires splitting and a lot more work.

/usr/src/sys/i386/isa/isa_device.h:
o Declare more things more completely.

/usr/src/sys/i386/isa/sio.c:
o Prepare to register interrupt handlers as fast.

/usr/src/sys/i386/isa/vector.s:
o Generate entry code for 16 fast interrupt handlers and 16 normal
interrupt handlers. Changed some constants to variables:
# $unit is now intr_unit[intr]. Type is int. Someday it should
be a cookie suitable for the handler (e.g., a struct com_s for
sio).
# $handler is now intr_handler[intr].
# intrcnt_actv[id_num] is now *intr_countp[intr]. The indirection
is required to get a contiguous range of counters for vmstat
and so that the drivers depend more in the driver than on the
interrupt number (drivers could take turns using an interrupt
and the counts would remain correct). There is a separate
counter for each device and for each stray interrupt. In
1.1.5, stray interrupt 7 clobbers the count for device 7 or
something worse if there is no device 7 :-(.
# mask is now intr_mask[intr] (was already indirect).
o Entry points are now _XintrI and _XfastintrI (I = intr = 0-15),
not _VdevU (U = unit).
o Removed BUILD_VECTORS stuff. There's a trace of it left for
the string table for vmstat but config now generates the
string in one piece because nothing more is required.
o Removed old handling of stray interrupts and older comments
about it.

Submitted by: Bruce Evans


# 2074 15-Aug-1994 wollman

Enable use of the RTC chip for the statistical clock. While this does
not provide the full accuracy of a randomized statistical clock, it does
provide greater accuracy than the previous method, while not significantly
increasing overhead. It also provides profiling support at 1024 Hz.

You must re-compile config before making a new kernel, or you will end
up with unresolved symbols.

Reviewed uy: Bruce evans said it worked for him.


# 2056 13-Aug-1994 wollman

Change all #includes to follow the current Berkeley style. Some of these
``changes'' are actually not changes at all, but CVS sometimes has trouble
telling the difference.

This also includes support for second-directory compiles. This is not
quite complete yet, as `config' doesn't yet do the right thing. You can
still make it work trivially, however, by doing the following:

rm /sys/compile
mkdir /usr/obj/sys/compile
ln -s M-. /sys/compile
cd /sys/i386/conf
config MYKERNEL
cd ../../compile/MYKERNEL
ln -s /sys @
rm machine
ln -s @/i386/include machine
make depend
make


# 2017 10-Aug-1994 wollman

For Pentium machines, use a faster version of microtime with 8 usec
resolution (can probably be improved somewhat). Other machines take
a three-instruction hit if I586_CPU is defined, none otherwise.


# 2014 10-Aug-1994 wollman

Tell Pentium users their CPU speed. (More changes to make use of this
to come later.)


# 1549 25-May-1994 rgrimes

The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.

Reviewed by: Rodney W. Grimes
Submitted by: John Dyson and David Greenman


# 1442 02-May-1994 sos

Update the reprogram timer stuff, now the frequency of timer 0
can only be changed at the "right" times. Accuracy should be
assured.


# 1407 23-Apr-1994 wollman

Define new option, INACCURATE_MICROTIME_IS_OK. When this is defined,
the NTP kernel PLL is disabled, and acquire_timer0() is enabled, thus
opening the door for microtime() (and hence gettimeofday()) to return
bogus timestamps. This option is necessary for the `pca' driver to
work, but is implemented to underscore the fact that accurate timekeeping
and the `pca' driver are incompatible at present. If someone writes a version
of microtime() that works when the `pca' driver is being used, this can get
junked.


# 1390 21-Apr-1994 sos

New support for sharing the timers
acquire_timer / release_timer

Pulled in timer related functions from isa.c


# 1104 06-Feb-1994 dg

At the suggestion of Bruce Evans, don't zero RTC diag register. Doing so
was causing problems for some machines.


# 879 18-Dec-1993 wollman

Make everything compile with -Wtraditional. Make it easier to distribute
a binary link-kit. Make all non-optional options (pagers, procfs) standard,
and update LINT to reflect new symtab requirements.

NB: -Wtraditional will henceforth be forgotten. This editing pass was
primarily intended to detect any constructions where the old code might
have been relying on traditional C semantics or syntax. These were all
fixed, and the result of fixing some of them means that -Wall is now a
realistic possibility within a few weeks.


# 798 24-Nov-1993 wollman

Make the LINT kernel compile with -W -Wreturn-type -Wcomment -Werror, and
add same (sans -Werror) to Makefile for future compilations.


# 700 03-Nov-1993 ache

DST offset calculation removed, it is wrong in any case.


# 619 16-Oct-1993 rgrimes

Removed all patch kit headers, sccsid and rcsid strings, put $Id$ in, some
minor cleanup. Added $Id$ to files that did not have any version info, etc


# 5 12-Jun-1993 rgrimes

This commit was generated by cvs2svn to compensate for changes in r4,
which included commits to RCS files with non-trunk default branches.


# 4 12-Jun-1993 rgrimes

Initial import, 0.1 + pk 0.2.4-B1