#
352383 |
|
16-Sep-2019 |
kib |
MFC r352059, r352060: Make timehands count selectable at boottime.
|
#
327196 |
|
26-Dec-2017 |
kib |
MFC r326973: Use atomic_load(9) to read ppsinfo sequence numbers.
|
#
324718 |
|
18-Oct-2017 |
kib |
MFC r324528: In tc_windup(), do not re-calculate bintime.
|
#
316120 |
|
29-Mar-2017 |
vangyzen |
MFC r315280 r315287
When the RTC is adjusted, reevaluate absolute sleep times based on the RTC
POSIX 2008 says this about clock_settime(2):
If the value of the CLOCK_REALTIME clock is set via clock_settime(), the new value of the clock shall be used to determine the time of expiration for absolute time services based upon the CLOCK_REALTIME clock. This applies to the time at which armed absolute timers expire. If the absolute time requested at the invocation of such a time service is before the new value of the clock, the time service shall expire immediately as if the clock had reached the requested time normally.
Setting the value of the CLOCK_REALTIME clock via clock_settime() shall have no effect on threads that are blocked waiting for a relative time service based upon this clock, including the nanosleep() function; nor on the expiration of relative timers based upon this clock. Consequently, these time services shall expire when the requested relative interval elapses, independently of the new or old value of the clock.
When the real-time clock is adjusted, such as by clock_settime(3), wake any threads sleeping until an absolute real-clock time. Such a sleep is indicated by a non-zero td_rtcgen. The sleep functions will set that field to zero and return zero to tell the caller to reevaluate its sleep duration based on the new value of the clock.
At present, this affects the following functions:
pthread_cond_timedwait(3) pthread_mutex_timedlock(3) pthread_rwlock_timedrdlock(3) pthread_rwlock_timedwrlock(3) sem_timedwait(3) sem_clockwait_np(3)
I'm working on adding clock_nanosleep(2), which will also be affected.
Reported by: Sebastian Huber <sebastian.huber@embedded-brains.de> Relnotes: yes Sponsored by: Dell EMC
|
#
305866 |
|
16-Sep-2016 |
kib |
MFC r304285: Implement userspace gettimeofday(2) with HPET timecounter.
|
#
304974 |
|
28-Aug-2016 |
kib |
MFC r303548: Cache getbintime(9) answer in timehands.
|
#
304843 |
|
26-Aug-2016 |
kib |
MFC r303382: Provide the getboottime(9) and getboottimebin(9) KPI.
MFC r303387: Prevent parallel tc_windup() calls. Keep boottime in timehands, and adjust it from tc_windup().
MFC notes:
The boottime and boottimebin globals are still exported from the kernel dyn symbol table in stable/11, but their declarations are removed from sys/time.h. This preserves KBI but not KPI, while all in-tree consumers are converted to getboottime().
The variables are updated after tc_setclock_mtx is dropped, which gives approximately same unlocked bugs as before.
The boottime and boottimebin locals in several sys/kern_tc.c functions were renamed by adding the '_x' suffix to avoid name conficts.
|
#
304840 |
|
26-Aug-2016 |
kib |
MFC r303384: Style.
|
#
304839 |
|
26-Aug-2016 |
kib |
MFC r303383: Reduce number of timehands to just two.
|
#
302408 |
|
07-Jul-2016 |
gjb |
Copy head@r302406 to stable/11 as part of the 11.0-RELEASE cycle. Prune svn:mergeinfo from the new branch, as nothing has been merged here.
Additional commits post-branch will follow.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
298819 |
|
29-Apr-2016 |
pfg |
sys/kern: spelling fixes in comments.
No functional change.
|
#
290257 |
|
02-Nov-2015 |
ngie |
Define `fhard` in pps_event(..) only when PPS_SYNC is defined to mute an -Wunused-but-set-variable warning
Reported by: FreeBSD_HEAD_amd64_gcc4.9 jenkins job Sponsored by: EMC / Isilon Storage Division
|
#
288216 |
|
25-Sep-2015 |
kib |
Use per-cpu values for base and last in tc_cpu_ticks(). The values are updated lockess, different CPUs write its own view of timecounter state. The critical section is done for safety, callers of tc_cpu_ticks() are supposed to already enter critical section, or to own a spinlock.
The change fixes sporadical reports of too high values reported for the (W)CPU on platforms that do not provide cpu ticker and use tc_cpu_ticks(), in particular, arm*.
Diagnosed and reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
286701 |
|
12-Aug-2015 |
ian |
If a specific timecounter has been chosen via sysctl, and a new timecounter with higher quality registers (presumably in a module that has just been loaded), do not undo the user's choice by switching to the new timecounter.
Document that behavior, and also the fact that there is no way to unregister a timecounter (and thus no way to unload a module containing one).
|
#
286429 |
|
07-Aug-2015 |
ian |
Only process the PPS event types currently enabled in pps_params.mode.
This makes the PPS API behave correctly, but isn't ideal -- we still end up capturing PPS data for non-enabled edges, we just don't process the data into an event that becomes visible outside of kern_tc. That's because the event type isn't passed to pps_capture(), so it can't do the filtering. Any solution for capture filtering is going to require touching every driver.
|
#
286423 |
|
07-Aug-2015 |
ian |
RFC 2783 requires a status of ETIMEDOUT, not EWOULDBLOCK, on a timeout.
|
#
285286 |
|
08-Jul-2015 |
kib |
Reimplement the ordering requirements for the timehands updates, and for timehands consumers, by using fences.
Ensure that the timehands->th_generation reset to zero is visible before the data update is visible [*]. tc_setget() allowed data update writes to become visible before generation (but not on TSO architectures).
Remove tc_setgen(), tc_getgen() helpers, use atomics inline [**].
Noted by: alc [*] Requested by: bde [**] Reviewed by: alc, bde Sponsored by: The FreeBSD Foundation MFC after: 3 weeks
|
#
284256 |
|
11-Jun-2015 |
kib |
Tweaks for r284178:
Do not include machine/atomic.h explicitely, the header is already included by sys/systm.h.
Force inlining of tc_getgen() and tc_setgen(). The functions are used more than once, which causes compilers with non-aggressive inlining policies to generate calls.
Suggested by: bde Sponsored by: The FreeBSD Foundation MFC after: 1 week
|
#
284178 |
|
09-Jun-2015 |
kib |
When updating/accessing the timehands, barriers are needed to ensure that: - th_generation update is visible after the parameters update is visible; - the read of parameters is not reordered before initial read of th_generation.
On UP kernels, compiler barriers are enough. For SMP machines, CPU barriers must be used too, as was confirmed by submitter by testing on the Freescale T4240 platform with 24 PowerPC processors.
Submitted by: Sebastian Huber <sebastian.huber@embedded-brains.de> MFC after: 1 week
|
#
282424 |
|
04-May-2015 |
ian |
Implement a mechanism for making changes in the kernel<->driver PPS interface without breaking ABI or API compatibility with existing drivers.
The existing data structures used to communicate between the kernel and driver portions of PPS processing contain no spare/padding fields and no flags field or other straightforward mechanism for communicating changes in the structures or behaviors of the code. This makes it difficult to MFC new features added to the PPS facility. ABI compatibility is important; out-of-tree drivers in module form are known to exist. (Note that the existing api_version field in the pps_params structure must contain the value mandated by RFC 2783 and any RFCs that come along after.)
These changes introduce a pair of abi-version fields which are filled in by the driver and the kernel respectively to indicate the interface version. The driver sets its version field before calling the new pps_init_abi() function. That lets the kernel know how much of the pps_state structure is understood by the driver and it can avoid using newer fields at the end of the structure that it knows about if the driver is a lower version. The kernel fills in its version field during the init call, letting the driver know what features and data the kernel supports.
To implement the new version information in a way that is backwards compatible with code from before these changes, the high bit of the lightly-used 'kcmode' field is repurposed as a flag bit that indicates the driver is aware of the abi versioning scheme. Basically if this bit is clear that indicates a "version 0" driver and if it is set the driver_abi field indicates the version.
These changes also move the recently-added 'mtx' field of pps_state from the middle to the end of the structure, and make the kernel code that uses this field conditional on the driver being abi version 1 or higher. It changes the only driver currently supplying the mtx field, usb_serial, to use pps_init_abi().
Reviewed by: hselasky@
|
#
280012 |
|
14-Mar-2015 |
ian |
Use sbuf_printf() for sysctl strings instead of stack buffers and snprintf().
|
#
279728 |
|
07-Mar-2015 |
hselasky |
Add mutex support to the pps_ioctl() API in the kernel. Bump kernel version to reflect structure change.
PR: 196897 MFC after: 1 week
|
#
277406 |
|
20-Jan-2015 |
neel |
Update the vdso timehands only via tc_windup().
Prior to this change CLOCK_MONOTONIC could go backwards when the timecounter hardware was changed via 'sysctl kern.timecounter.hardware'. This happened because the vdso timehands update was missing the special treatment in tc_windup() when changing timecounters.
Reviewed by: kib
|
#
276724 |
|
05-Jan-2015 |
jhb |
On some Intel CPUs with a P-state but not C-state invariant TSC the TSC may also halt in C2 and not just C3 (it seems that in some cases the BIOS advertises its C3 state as a C2 state in _CST). Just play it safe and disable both C2 and C3 states if a user forces the use of the TSC as the timecounter on such CPUs.
PR: 192316 Differential Revision: https://reviews.freebsd.org/D1441 No objection from: jkim MFC after: 1 week
|
#
267992 |
|
28-Jun-2014 |
hselasky |
Pull in r267961 and r267973 again. Fix for issues reported will follow.
|
#
267985 |
|
27-Jun-2014 |
gjb |
Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output, such as:
1) no output from sysctl(8) 2) erroneously returning ENOMEM with tools like truss(1) or uname(1) truss: can not get etype: Cannot allocate memory
|
#
267961 |
|
27-Jun-2014 |
hselasky |
Extend the meaning of the CTLFLAG_TUN flag to automatically check if there is an environment variable which shall initialize the SYSCTL during early boot. This works for all SYSCTL types both statically and dynamically created ones, except for the SYSCTL NODE type and SYSCTLs which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to be used in the case a tunable sysctl has a custom initialisation function allowing the sysctl to still be marked as a tunable. The kernel SYSCTL API is mostly the same, with a few exceptions for some special operations like iterating childrens of a static/extern SYSCTL node. This operation should probably be made into a factored out common macro, hence some device drivers use this. The reason for changing the SYSCTL API was the need for a SYSCTL parent OID pointer and not only the SYSCTL parent OID list pointer in order to quickly generate the sysctl path. The motivation behind this patch is to avoid parameter loading cludges inside the OFED driver subsystem. Instead of adding special code to the OFED driver subsystem to post-load tunables into dynamically created sysctls, we generalize this in the kernel.
Other changes: - Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask" to "hw.pcic.intr_mask". - Removed redundant TUNABLE statements throughout the kernel. - Some minor code rewrites in connection to removing not needed TUNABLE statements. - Added a missing SYSCTL_DECL(). - Wrapped two very long lines. - Avoid malloc()/free() inside sysctl string handling, in case it is called to initialize a sysctl from a tunable, hence malloc()/free() is not ready when sysctls from the sysctl dataset are registered. - Bumped FreeBSD version to indicate SYSCTL API change.
MFC after: 2 weeks Sponsored by: Mellanox Technologies
|
#
247777 |
|
04-Mar-2013 |
davide |
- Make callout(9) tickless, relying on eventtimers(4) as backend for precise time event generation. This greatly improves granularity of callouts which are not anymore constrained to wait next tick to be scheduled. - Extend the callout KPI introducing a set of callout_reset_sbt* functions, which take a sbintime_t as timeout argument. The new KPI also offers a way for consumers to specify precision tolerance they allow, so that callout can coalesce events and reduce number of interrupts as well as potentially avoid scheduling a SWI thread. - Introduce support for dispatching callouts directly from hardware interrupt context, specifying an additional flag. This feature should be used carefully, as long as interrupt context has some limitations (e.g. no sleeping locks can be held). - Enhance mechanisms to gather informations about callwheel, introducing a new sysctl to obtain stats.
This change breaks the KBI. struct callout fields has been changed, in particular 'int ticks' (4 bytes) has been replaced with 'sbintime_t' (8 bytes) and another 'sbintime_t' field was added for precision.
Together with: mav Reviewed by: attilio, bde, luigi, phk Sponsored by: Google Summer of Code 2012, iXsystems inc. Tested by: flo (amd64, sparc64), marius (sparc64), ian (arm), markj (amd64), mav, Fabian Keil
|
#
246845 |
|
15-Feb-2013 |
ian |
Add PPS_CANWAIT support for time_pps_fetch(). This adds support for all three blocking modes described in section 3.4.3 of RFC 2783, allowing the caller to retrieve the most recent values without blocking, to block for a specified time, or to block forever.
Reviewed by: discussion on hackers@
|
#
246037 |
|
28-Jan-2013 |
jhb |
Mark 'ticks', 'time_second', and 'time_uptime' as volatile to prevent the compiler from caching their values in tight loops.
Reviewed by: bde MFC after: 1 week
|
#
238537 |
|
16-Jul-2012 |
gnn |
Add support for walltimestamp in DTrace.
Submitted by: Fabian Keil MFC after: 2 weeks
|
#
237474 |
|
23-Jun-2012 |
kib |
Stop updating the struct vdso_timehands from even handler executed in the scheduled task from tc_windup(). Do it directly from tc_windup in interrupt context [1].
Establish the permanent mapping of the shared page into the kernel address space, avoiding the potential need to sleep waiting for allocation of sf buffer during vdso_timehands update. As a consequence, shared_page_write_start() and shared_page_write_end() functions are not needed anymore.
Guess and memorize the pointers to native host and compat32 sysentvec during initialization, to avoid the need to get shared_page_alloc_sx lock during the update.
In tc_fill_vdso_timehands(), do not loop waiting for timehands generation to stabilize, since vdso_timehands is written in the same interrupt context which wrote timehands.
Requested by: mav [1] MFC after: 29 days
|
#
237433 |
|
22-Jun-2012 |
kib |
Implement mechanism to export some kernel timekeeping data to usermode, using shared page. The structures and functions have vdso prefix, to indicate the intended location of the code in some future.
The versioned per-algorithm data is exported in the format of struct vdso_timehands, which mostly repeats the content of in-kernel struct timehands. Usermode reading of the structure can be lockless. Compatibility export for 32bit processes on 64bit host is also provided. Kernel also provides usermode with indication about currently used timecounter, so that libc can fall back to syscall if configured timecounter is unknown to usermode code.
The shared data updates are initiated both from the tc_windup(), where a fast task is queued to do the update, and from sysctl handlers which change timecounter. A manual override switch kern.timecounter.fast_gettime allows to turn off the mechanism.
Only x86 architectures export the real algorithm data, and there, only for tsc timecounter. HPET counters page could be exported as well, but I prefer to not further glue the kernel and libc ABI there until proper vdso-based solution is developed.
Minimal stubs neccessary for non-x86 architectures to still compile are provided.
Discussed with: bde Reviewed by: jhb Tested by: flo MFC after: 1 month
|
#
232449 |
|
03-Mar-2012 |
jmallett |
o) Add COMPAT_FREEBSD32 support for MIPS kernels using the n64 ABI with userlands using the o32 ABI. This mostly follows nwhitehorn's lead in implementing COMPAT_FREEBSD32 on powerpc64. o) Add a new type to the freebsd32 compat layer, time32_t, which is time_t in the 32-bit ABI being used. Since the MIPS port is relatively-new, even the 32-bit ABIs use a 64-bit time_t. o) Because time{spec,val}32 has the same size and layout as time{spec,val} on MIPS with 32-bit compatibility, then, disable some code which assumes otherwise wrongly when built for MIPS. A more general macro to check in this case would seem like a good idea eventually. If someone adds support for using n32 userland with n64 kernels on MIPS, then they will have to add a variety of flags related to each piece of the ABI that can vary. That's probably the right time to generalize further. o) Add MIPS to the list of architectures which use PAD64_REQUIRED in the freebsd32 compat code. Probably this should be generalized at some point.
Reviewed by: gonzo
|
#
231341 |
|
10-Feb-2012 |
kevlo |
Add a missing break. This bug was introduced in r228856.
|
#
228856 |
|
23-Dec-2011 |
lstewart |
Introduce the sysclock_getsnapshot() and sysclock_snap2bintime() KPIs. The sysclock_getsnapshot() function allows the caller to obtain a snapshot of all the system clock and timecounter state required to create time stamps at a later point. The sysclock_snap2bintime() function converts a previously obtained snapshot into a bintime time stamp according to the specified flags e.g. which system clock, uptime vs absolute time, etc.
These KPIs enable useful functionality, including direct comparison of the feedback and feed-forward system clocks and generation of multiple time stamps with different formats from a single timecounter read.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
In collaboration with: Julien Ridoux (jridoux at unimelb edu au)
|
#
228123 |
|
29-Nov-2011 |
lstewart |
Do away with the somewhat clunky sysclock_ops structure and associated code, reimplementing the [get]{bin,nano,micro}[up]time() wrapper functions in terms of the new "fromclock" API instead.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Discussed with: Julien Ridoux (jridoux at unimelb edu au) Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
228117 |
|
29-Nov-2011 |
lstewart |
Make the fbclock_[get]{bin,nano,micro}[up]time() function prototypes public so that new APIs with some performance sensitivity can be built on top of them. These functions should not be called directly except in special circumstances.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Discussed with: Julien Ridoux (jridoux at unimelb edu au) Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
228115 |
|
29-Nov-2011 |
lstewart |
Fix an oversight in r227747 by calling fbclock_bin{up}time() directly from the fbclock_{nanouptime|microuptime|bintime|nanotime|microtime}() functions to avoid indirecting through a sysclock_ops wrapper function.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
227789 |
|
21-Nov-2011 |
lstewart |
- Add Pulse-Per-Second timestamping using raw ffcounter and corresponding ffclock time in seconds.
- Add IOCTL to retrieve ffclock timestamps from userland.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
227747 |
|
20-Nov-2011 |
lstewart |
- Provide a sysctl interface to change the active system clock at runtime.
- Wrap [get]{bin,nano,micro}[up]time() functions of sys/time.h to allow requesting time from either the feedback or the feed-forward clock. If a feedback (e.g. ntpd) and feed-forward (e.g. radclock) daemon are both running on the system, both kernel clocks are updated but only one serves time.
- Add similar wrappers for the feed-forward difference clock.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
227723 |
|
19-Nov-2011 |
lstewart |
Core structure and functions to support a feed-forward clock within the kernel. Implement ffcounter, a monotonically increasing cumulative counter on top of the active timecounter. Provide low-level functions to read the ffcounter and convert it to absolute time or a time interval in seconds using the current ffclock estimates, which track the drift of the oscillator. Add a ring of fftimehands to track passing of time on each kernel tick and pick up updates of ffclock estimates.
Committed on behalf of Julien Ridoux and Darryl Veitch from the University of Melbourne, Australia, as part of the FreeBSD Foundation funded "Feed-Forward Clock Synchronization Algorithms" project.
For more information, see http://www.synclab.org/radclock/
Submitted by: Julien Ridoux (jridoux at unimelb edu au)
|
#
227309 |
|
07-Nov-2011 |
ed |
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
#
224042 |
|
14-Jul-2011 |
jkim |
If TSC stops ticking in C3, disable deep sleep when the user forcefully select TSC as timecounter hardware.
Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)
|
#
217616 |
|
19-Jan-2011 |
mdf |
Introduce signed and unsigned version of CTLTYPE_QUAD, renaming existing uses. Rename sysctl_handle_quad() to sysctl_handle_64().
|
#
215732 |
|
23-Nov-2010 |
cperciva |
Add parentheses for clarity. The parentheses around the two terms of the && are unnecessary but I'm leaving them in for the sake of avoiding confusion (I confuse easily).
Submitted by: bde
|
#
215665 |
|
22-Nov-2010 |
cperciva |
In tc_windup, handle the case where the previous call to tc_windup was more than 1s earlier. Prior to this commit, the computation of th_scale * delta (which produces a 64-bit value equal to the time since the last tc_windup call in units of 2^(-64) seconds) would overflow and any complete seconds would be lost.
We fix this by repeatedly converting tc_frequency units of timecounter to one seconds; this is not exactly correct, since it loses the NTP adjustment, but if we find ourselves going more than 1s at a time between clock interrupts, losing a few seconds worth of NTP adjustments is the least of our problems...
|
#
215304 |
|
14-Nov-2010 |
brucec |
Fix some more style(9) issues.
|
#
215283 |
|
14-Nov-2010 |
brucec |
Fix style(9) issues from r215281 and r215282.
MFC after: 1 week
|
#
215281 |
|
14-Nov-2010 |
brucec |
Add some descriptions to sys/kern sysctls.
PR: kern/148710 Tested by: Chip Camden <sterling at camdensoftware.com> MFC after: 1 week
|
#
212958 |
|
21-Sep-2010 |
mav |
Until hardclock() and respectively tc_windup() called first time, system is running on "dummy" time counter. But to function properly in one-shot mode, event timer management code requires working time counter. Slow moving "dummy" time counter delays first hardclock() call by few seconds on my systems, even though timer interrupts were correctly kicking kernel. That causes few seconds delay during boot with one-shot mode enabled.
To break this loop, explicitly call tc_windup() first time during initialization process to let it switch to some real time counter.
|
#
212603 |
|
14-Sep-2010 |
mav |
Make kern_tc.c provide minimum frequency of tc_ticktock() calls, required to handle current timecounter wraps. Make kern_clocksource.c to honor that requirement, scheduling sleeps on first CPU for no more then specified period. Allow other CPUs to sleep up to 1/4 second (for any case).
|
#
212541 |
|
13-Sep-2010 |
mav |
Refactor timer management code with priority to one-shot operation mode. The main goal of this is to generate timer interrupts only when there is some work to do. When CPU is busy interrupts are generating at full rate of hz + stathz to fullfill scheduler and timekeeping requirements. But when CPU is idle, only minimum set of interrupts (down to 8 interrupts per second per CPU now), needed to handle scheduled callouts is executed. This allows significantly increase idle CPU sleep time, increasing effect of static power-saving technologies. Also it should reduce host CPU load on virtualized systems, when guest system is idle.
There is set of tunables, also available as writable sysctls, allowing to control wanted event timer subsystem behavior: kern.eventtimer.timer - allows to choose event timer hardware to use. On x86 there is up to 4 different kinds of timers. Depending on whether chosen timer is per-CPU, behavior of other options slightly differs. kern.eventtimer.periodic - allows to choose periodic and one-shot operation mode. In periodic mode, current timer hardware taken as the only source of time for time events. This mode is quite alike to previous kernel behavior. One-shot mode instead uses currently selected time counter hardware to schedule all needed events one by one and program timer to generate interrupt exactly in specified time. Default value depends of chosen timer capabilities, but one-shot mode is preferred, until other is forced by user or hardware. kern.eventtimer.singlemul - in periodic mode specifies how much times higher timer frequency should be, to not strictly alias hardclock() and statclock() events. Default values are 2 and 4, but could be reduced to 1 if extra interrupts are unwanted. kern.eventtimer.idletick - makes each CPU to receive every timer interrupt independently of whether they busy or not. By default this options is disabled. If chosen timer is per-CPU and runs in periodic mode, this option has no effect - all interrupts are generating.
As soon as this patch modifies cpu_idle() on some platforms, I have also refactored one on x86. Now it makes use of MONITOR/MWAIT instrunctions (if supported) under high sleep/wakeup rate, as fast alternative to other methods. It allows SMP scheduler to wake up sleeping CPUs much faster without using IPI, significantly increasing performance on some highly task-switching loads.
Tested by: many (on i386, amd64, sparc64 and powerc) H/W donated by: Gheorghe Ardelean Sponsored by: iXsystems, Inc.
|
#
210226 |
|
18-Jul-2010 |
trasz |
Revert r210225 - turns out I was wrong; the "/*-" is not license-only thing; it's also used to indicate that the comment should not be automatically rewrapped.
Explained by: cperciva@
|
#
210225 |
|
18-Jul-2010 |
trasz |
The "/*-" comment marker is supposed to denote copyrights. Remove non-copyright occurences from sys/sys/ and sys/kern/.
|
#
209900 |
|
11-Jul-2010 |
mav |
Remove interval validation from cpu_tick_calibrate(). As I found, check was needed at preliminary version of the patch, where number of CPU ticks was divided strictly on 16 seconds. Final code instead uses real interval duration, so precise interval should not be important. Same time aliasing issues around second boundary causes false positives, periodically logging useless "t_delta ... too long/short" messages when HZ set below 256.
|
#
209390 |
|
21-Jun-2010 |
ed |
Use ISO C99 integer types in sys/kern where possible.
There are only about 100 occurences of the BSD-specific u_int*_t datatypes in sys/kern. The ISO C99 integer types are used here more often.
|
#
209216 |
|
15-Jun-2010 |
jkim |
Implement flexible BPF timestamping framework.
- Allow setting format, resolution and accuracy of BPF time stamps per listener. Previously, we were only able to use microtime(9). Now we can set various resolutions and accuracies with ioctl(2) BIOCSTSTAMP command. Similarly, we can get the current resolution and accuracy with BIOCGTSTAMP command. Document all supported options in bpf(4) and their uses.
- Introduce new time stamp 'struct bpf_ts' and header 'struct bpf_xhdr'. The new time stamp has both 64-bit second and fractional parts. bpf_xhdr has this time stamp instead of 'struct timeval' for bh_tstamp. The new structures let us use bh_tstamp of same size on both 32-bit and 64-bit platforms without adding additional shims for 32-bit binaries. On 64-bit platforms, size of BPF header does not change compared to bpf_hdr as its members are already all 64-bit long. On 32-bit platforms, the size may increase by 8 bytes. For backward compatibility, struct bpf_hdr with struct timeval is still the default header unless new time stamp format is explicitly requested. However, the behaviour may change in the future and all relevant code is wrapped around "#ifdef BURN_BRIDGES" for now.
- Add experimental support for tagging mbufs with time stamps from a lower layer, e.g., device driver. Currently, mbuf_tags(9) is used to tag mbufs. The time stamps must be uptime in 'struct bintime' format as binuptime(9) and getbinuptime(9) do.
Reviewed by: net@
|
#
190947 |
|
11-Apr-2009 |
rwatson |
Remove conditionally compiled time counter statistics; tools like DTrace, kernel profiling, etc, can provide this information without the overhead.
MFC after: 3 days Suggested by: bde
|
#
189545 |
|
08-Mar-2009 |
rwatson |
By default, don't compile in counters of calls to various time query functions in the kernel, as these effectively serialize parallel calls to the gettimeofday(2) system call, as well as other kernel services that use timestamps.
Use the NetBSD version of the fix (kern_tc.c:1.32 by ad@) as they have picked up our timecounter code and also ran into the same problem.
Reported by: kris Obtained from: NetBSD MFC after: 3 days
|
#
177253 |
|
16-Mar-2008 |
rwatson |
In keeping with style(9)'s recommendations on macros, use a ';' after each SYSINIT() macro invocation. This makes a number of lightweight C parsers much happier with the FreeBSD kernel source, including cflow's prcc and lxr.
MFC after: 1 month Discussed with: imp, rink
|
#
176351 |
|
17-Feb-2008 |
imp |
Fix typo in comment.
|
#
175058 |
|
02-Jan-2008 |
obrien |
Note what is too {short,long}.
|
#
170289 |
|
04-Jun-2007 |
dwmalone |
Despite several examples in the kernel, the third argument of sysctl_handle_int is not sizeof the int type you want to export. The type must always be an int or an unsigned int.
Remove the instances where a sizeof(variable) is passed to stop people accidently cut and pasting these examples.
In a few places this was sysctl_handle_int was being used on 64 bit types, which would truncate the value to be exported. In these cases use sysctl_handle_quad to export them and change the format to Q so that sysctl(1) can still print them.
|
#
160964 |
|
04-Aug-2006 |
yar |
Commit the results of the typo hunt by Darren Pilgrim. This change affects documentation and comments only, no real code involved.
PR: misc/101245 Submitted by: Darren Pilgrim <darren pilgrim bitfreak org> Tested by: md5(1) MFC after: 1 week
|
#
159669 |
|
16-Jun-2006 |
dwmalone |
Add a kern.timecounter.tc sysctl tree that contains the mask, frequency, quality and current value of each available time counter.
At the moment all of these are read-only, but it might make sense to make some of these read-write in the future.
MFC after: 3 months
|
#
156753 |
|
15-Mar-2006 |
phk |
Disable the "cputick increased..." message now that the dust has settled.
|
#
156485 |
|
09-Mar-2006 |
phk |
Oops, forgot newline.
|
#
156483 |
|
09-Mar-2006 |
phk |
silence cpu_tick calibration and notice only (under bootverbose) when the frequency increases.
|
#
156413 |
|
07-Mar-2006 |
jhb |
Style nit.
|
#
156271 |
|
04-Mar-2006 |
phk |
Add missing cast.
|
#
156270 |
|
04-Mar-2006 |
phk |
More detailed logging if timestepwarnings are enabled.
|
#
156205 |
|
02-Mar-2006 |
phk |
Suffer a little bit of math every 16 second and tighten calibration of cpu_ticks to the low side of PPM.
|
#
155534 |
|
11-Feb-2006 |
phk |
CPU time accounting speedup (step 2)
Keep accounting time (in per-cpu) cputicks and the statistics counts in the thread and summarize into struct proc when at context switch.
Don't reach across CPUs in calcru().
Add code to calibrate the top speed of cpu_tickrate() for variable cpu_tick hardware (like TSC on power managed machines).
Don't enforce monotonicity (at least for now) in calcru. While the calibrated cpu_tickrate ramps up it may not be true.
Use 27MHz counter on i386/Geode.
Use TSC on amd64 & i386 if present.
Use tick counter on sparc64
|
#
155444 |
|
07-Feb-2006 |
phk |
Modify the way we account for CPU time spent (step 1)
Keep track of time spent by the cpu in various contexts in units of "cputicks" and scale to real-world microsec^H^H^H^H^H^H^H^Hclock_t only when somebody wants to inspect the numbers.
For now "cputicks" are still derived from the current timecounter and therefore things should by definition remain sensible also on SMP machines. (The main reason for this first milestone commit is to verify that hypothesis.)
On slower machines, the avoided multiplications to normalize timestams at every context switch, comes out as a 5-7% better score on the unixbench/context1 microbenchmark. On more modern hardware no change in performance is seen.
|
#
150348 |
|
19-Sep-2005 |
andre |
Start time_uptime with 1 instead of 0.
Discussed with: phk
|
#
149848 |
|
07-Sep-2005 |
obrien |
Forward declaring static variables as extern is invalid ISO-C. Now that GCC can properly handle forward static declarations, do this properly.
|
#
144152 |
|
26-Mar-2005 |
phk |
s/ENOTTY/ENOIOCTL/
|
#
136404 |
|
11-Oct-2004 |
peter |
Put on my peril sensitive sunglasses and add a flags field to the internal sysctl routines and state. Add some code to use it for signalling the need to downconvert a data structure to 32 bits on a 64 bit OS when requested by a 32 bit app.
I tried to do this in a generic abi wrapper that intercepted the sysctl oid's, or looked up the format string etc, but it was a real can of worms that turned into a fragile mess before I even got it partially working.
With this, we can now run 'sysctl -a' on a 32 bit sysctl binary and have it not abort. Things like netstat, ps, etc have a long way to go.
This also fixes a bug in the kern.ps_strings and kern.usrstack hacks. These do matter very much because they are used by libc_r and other things.
|
#
133714 |
|
14-Aug-2004 |
phk |
Add some KASSERTS.
|
#
126600 |
|
04-Mar-2004 |
phk |
Just because the timecounter reads the same value on two samples after each other doesn't mean that nothing happened.
|
#
124842 |
|
22-Jan-2004 |
phk |
Write 100 times for tomorrow: "Always print time_t as %jd, you never know what width it has"
|
#
124812 |
|
21-Jan-2004 |
phk |
Add a sysctl (default: off) which enables a log(LOG_INFO...) warning if the clock is stepped.
|
#
122610 |
|
13-Nov-2003 |
phk |
Various minor details: Give the HZ/overflow check a 10% margin. Eliminate bogus newline. If timecounters have equal quality, prefer higher frequency.
Some inspiration from: bde
|
#
119716 |
|
03-Sep-2003 |
phk |
Use the quality to disable timecounters for which we deem Hz too low.
|
#
119183 |
|
20-Aug-2003 |
imp |
bde made a number of suggested improvements to the code. This commit represents the pruely stylistic changes and should have no net impact on the rest of the code.
bde's more substantive changes will follow in a separate commit once we've come to closure on them.
Submitted by: bde
|
#
119160 |
|
20-Aug-2003 |
imp |
Fix an extreme edge case in leap second handling. We need to call ntp_update_second twice when we have a large step in case that step goes across a scheduled leap second. The only way this could happen would be if we didn't call tc_windup over the end of day on the day of a leap second, which would only happen if timeouts were delayed for seconds. While it is an edge case, it is an important one to get right for my employer.
Sponsored by: Timing Solutions Corporation
|
#
118987 |
|
16-Aug-2003 |
phk |
Give timecounters a numeric quality field.
A timecounter will be selected when registered if its quality is not negative and no less than the current timecounters.
Add a sysctl to report all available timecounters and their qualities.
Give the dummy timecounter a solid negative quality of minus a million.
Give the i8254 zero and the ACPI 1000.
The TSC gets 800, unless APM or SMP forces it negative.
Other timecounters default to zero quality and thereby retain current selection behaviour.
|
#
118842 |
|
12-Aug-2003 |
mux |
Remove extra space.
|
#
117148 |
|
02-Jul-2003 |
phk |
typo fix in comment.
|
#
116841 |
|
25-Jun-2003 |
imp |
Fix leap second processing by the kernel time keeping routines. Before, we would add/subtract the leap second when the system had been up for an even multiple of days, rather than at the end of the day, as a leap second is defined (at least wrt ntp). We do this by calculating the notion of UTC earlier in the loop, and passing that to get it adjusted. Any adjustments that ntp_update_second makes to this time are then transferred to boot time. We can't pass it either the boot time or the uptime because their sum is what determines when a leap second is needed. This code adds an extra assignment and two extra compare in the typical case, which is as cheap as I could made it.
I have confirmed with this code the kernel time does the correct thing for both positive and negative leap seconds. Since the ntp interface doesn't allow for +2 or -2, those cases can't be tested (and the folks in the know here say there will never be a +2s or -2s leap event, but rather two +1s or -1s leap events).
There will very likely be no leap seconds for a while, given how the earth is speeding up and slowing down, so there will be plenty of time for this fix to propigate. UT1-UTC is currently at "about -0.4s" and decrementing by .1s every 8 months or so. 6 * 8 is 48 months, or 4 years.
-stable has different code, but a similar bug that was introduced about the time of the last leap second, which is why nobody has noticed until now.
MFC After: 3 weeks Reviewed by: phk
"Furthermore, leap seconds must die." -- Cato the Elder
|
#
116756 |
|
23-Jun-2003 |
imp |
Use UTC rather than GMT to describe time scale. latter is obsolete.
|
#
116182 |
|
10-Jun-2003 |
obrien |
Use __FBSDID().
|
#
112367 |
|
18-Mar-2003 |
phk |
Including <sys/stdint.h> is (almost?) universally only to be able to use %j in printfs, so put a newsted include in <sys/systm.h> where the printf prototype lives and save everybody else the trouble.
|
#
110038 |
|
29-Jan-2003 |
phk |
Move timecounters notion of frequency to 64 bits.
[WARNING: CPUs in the distant future may be closer than they appear!]
|
#
109813 |
|
25-Jan-2003 |
phk |
Add sysctl kern.timecounter.nsetclock which indicates the number of potential discontinuities in our UTC timescale.
Applications can monitor this variable if they want to be informed about steps in the timescale. Slews (ntp and adjtime(2)) and frequency adjustments (ntp) will not increment this counter, only operations which set the clock. No attempt is made to classify size or direction of the step.
|
#
109392 |
|
16-Jan-2003 |
phk |
Move a local variable to avoid the compiler warning about it being unused.
|
#
109391 |
|
16-Jan-2003 |
jhay |
hardpps() wants the raw hardware counter value converted to nanoseconds.
|
#
108755 |
|
05-Jan-2003 |
peter |
Explicitly have the timecounter init happen after the cpu_initclocks is called. Otherwise (depending on a non-deterministic sort), the timecounter code can be initialized before the clock rate has been set (on ia64) and it assumes hz = 100, rather than the real value of 1024. I'm not sure how much gets upset by this.
Glanced at by: phk
|
#
108671 |
|
04-Jan-2003 |
phk |
Export tc_tick with sysctl, not tick.
Spotted by: bde
|
#
106304 |
|
01-Nov-2002 |
phk |
Introduce a "time_uptime" global variable which holds the time since boot in seconds.
|
#
105354 |
|
17-Oct-2002 |
robert |
Use strlcpy() instead of strncpy() to copy NUL terminated strings for safety and consistency.
|
#
102933 |
|
04-Sep-2002 |
phk |
Do not employ timecounter hardware if our hz does not support their correct rewinding.
|
#
102926 |
|
04-Sep-2002 |
phk |
Give up on calling tc_ticktock() from a timeout, we have timeout functions which run for several milliseconds at a time and getting in queue behind one or more of those makes us miss our rewind.
Instead call it from hardclock() like we used to do, but retain the prescaler so we still cope with high HZ values.
|
#
100074 |
|
15-Jul-2002 |
markm |
Use a semicolon at the end of a function-like macro invocation. Kills warnings and makes the visual style easier.
|
#
98120 |
|
11-Jun-2002 |
kbyanc |
Time counter stats are unsigned, advertise them to sysctl(8) that way.
PR: (one small part of) 19720 Approved by: phk
|
#
97610 |
|
30-May-2002 |
phk |
Mistyped and lost a '&' in previous commit.
|
#
97573 |
|
30-May-2002 |
phk |
Don't forget to factor in the boottime when we calculate PPS timestamps.
Submitted by: Akira Watanabe <akira@myaw.ei.meisei-u.ac.jp>
|
#
95976 |
|
03-May-2002 |
phk |
Initialize time_second to 1 instead of zero to pacify slightly bogus arp code.
Various minor style fixes from BDE.
|
#
95834 |
|
30-Apr-2002 |
peter |
kern_tc.c doesn't use <machine/psl.h>, and having this #include breaks other platforms.
|
#
95821 |
|
30-Apr-2002 |
phk |
Brucifixion ? Yes, out that door, row on the left, one patch each.
Many thanks to: bde
|
#
95661 |
|
28-Apr-2002 |
phk |
Stylistic sweep through the timecounter code. Renovate comments.
|
#
95659 |
|
28-Apr-2002 |
phk |
Don't screw up our uptime with historical dates.
|
#
95551 |
|
27-Apr-2002 |
phk |
Explain magic number.
Add magic date no explanation.
Add a delta which was lost in transit yesterday which prevented other timecounters from actually being used.
|
#
95549 |
|
27-Apr-2002 |
phk |
Make the dummy timecounter actually tick or we will never get anyhere.
|
#
95530 |
|
26-Apr-2002 |
phk |
Now that the private parts of timecounters are no longer being fingered by other bits of code, split struct timecounter into two.
struct timecounter contains just the bits which pertains to the hardware counter and the reading of it.
struct timehands (as in "the hands on a clock") contains all the ugly bit fidling stuff. Statically compile ten timehands.
This commit is the functional part. A later cosmetic patch will rename various variables and fieldnames.
|
#
95529 |
|
26-Apr-2002 |
phk |
Hide the private parts of timecounter from a couple of places that don't really need to know the gory details.
|
#
95523 |
|
26-Apr-2002 |
phk |
Simplify the RFC2783 and PPS_SYNC timestamp collection API.
|
#
95497 |
|
26-Apr-2002 |
phk |
Move the winding of timecounters out of hardclock and into a normal timeout loop.
Limit the rate at which we wind the timecounters to approx 1000 Hz.
This limits the precision of the get{bin,nano,micro}[up]time(9) functions to roughly a millisecond.
|
#
95491 |
|
26-Apr-2002 |
phk |
Various cleanup and sorting of clock reading functions. Add the two functions missing in the complete 12 function complement.
|
#
95490 |
|
26-Apr-2002 |
phk |
Rename tco_setscales() and tco_delta() to use the same tc_ prefix as the rest of this file.
|
#
95489 |
|
26-Apr-2002 |
phk |
Remove the tc_update() function. Any frequency change to the timecounter will be used starting at the next second, which is good enough for sysctl purposes. If better adjustment is needed the NTP PLL should be used.
|
#
94754 |
|
15-Apr-2002 |
phk |
Improve the implementation of adjtime(2).
Apply the change as a continuous slew rather than as a series of discrete steps and make it possible to adjust arbitraryly huge amounts of time in either direction.
In practice this is done by hooking into the same once-per-second loop as the NTP PLL and setting a suitable frequency offset deducting the amount slewed from the remainder. If the remaining delta is larger than 1 second we slew at 5000PPM (5msec/sec), for a delta less than a second we slew at 500PPM (500usec/sec) and for the last one second period we will slew at whatever rate (less than 500PPM) it takes to eliminate the delta entirely.
The old implementation stepped the clock a number of microseconds every HZ to acheive the same effect, using the same rates of change.
Eliminate the global variables tickadj, tickdelta and timedelta and their various use and initializations.
This removes the most significant obstacle to running timecounter and NTP housekeeping from a timeout rather than hardclock.
|
#
93343 |
|
28-Mar-2002 |
phk |
Get the magnitude of the NTP adjustment right.
|
#
92723 |
|
19-Mar-2002 |
alfred |
Remove __P.
|
#
91290 |
|
26-Feb-2002 |
phk |
Remove unused variable.
|
#
91200 |
|
24-Feb-2002 |
phk |
Add a generation number to timecounters and spin if it changes under our feet when we look inside timecounter structures.
Make the "sync_other" code more robust by never overwriting the tc_next field.
Add counters for the bin[up]time functions.
Call tc_windup() in tc_init() and switch_timecounter() to make sure we all the fields set right.
|
#
91065 |
|
22-Feb-2002 |
phk |
Use better scaling factor for NTPs correction. Explain the magic.
|
#
90362 |
|
07-Feb-2002 |
phk |
Revise timercounters to use binary fixed point format internally.
The binary format "bintime" is a 32.64 format, it will go to 64.64 when time_t does.
The bintime format is available to consumers of time in the kernel, and is preferable where timeintervals needs to be accumulated.
This change simplifies much of the magic math inside the timecounters and improves the frequency and time precision by a couple of bits.
I have not been able to measure a performance difference which was not a tiny fraction of the standard deviation on the measurements.
|
#
90260 |
|
05-Feb-2002 |
phk |
Let the number of timecounters follow hz, otherwise people with HZ=BIGNUM will strain the assumptions behind timecounters to the point where they break.
This may or may not help people seeing microuptime() backwards messages.
Make the global timecounter variable volatile, it makes no difference in the code GCC generates, but it makes represents the intent correctly.
Thanks to: jdp MFC after: 2 weeks
|
#
89801 |
|
25-Jan-2002 |
phk |
Be more conservative about interrupt latency, it aint getting better it seems.
|
#
70574 |
|
01-Jan-2001 |
phk |
Remove a bogus #ifdef KTR stanza.
Noticed by: Alexander Langer <alex@big.endian.de>
|
#
65557 |
|
06-Sep-2000 |
jasone |
Major update to the way synchronization is done in the kernel. Highlights include:
* Mutual exclusion is used instead of spl*(). See mutex(9). (Note: The alpha port is still in transition and currently uses both.)
* Per-CPU idle processes.
* Interrupts are run in their own separate kernel threads and can be preempted (i386 only).
Partially contributed by: BSDi (BSD/OS) Submissions by (at least): cp, dfr, dillon, grog, jake, jhb, sheldonh
|
#
62573 |
|
04-Jul-2000 |
phk |
Previous commit changing SYSCTL_HANDLER_ARGS violated KNF.
Pointed out by: bde
|
#
62454 |
|
03-Jul-2000 |
phk |
Style police catches up with rev 1.26 of src/sys/sys/sysctl.h:
Sanitize SYSCTL_HANDLER_ARGS so that simplistic tools can grog our sources:
-sysctl_vm_zone SYSCTL_HANDLER_ARGS +sysctl_vm_zone (SYSCTL_HANDLER_ARGS)
|
#
58377 |
|
20-Mar-2000 |
phk |
Isolate the Timecounter internals in their own two files.
Make the public interface more systematically named.
Remove the alternate method, it doesn't do any good, only ruins performance.
Add counters to profile the usage of the 8 access functions.
Apply the beer-ware to my code.
The weird +/- counts are caused by two repocopies behind the scenes: kern/kern_clock.c -> kern/kern_tc.c sys/time.h -> sys/timetc.h (thanks peter!)
|
#
57182 |
|
13-Feb-2000 |
phk |
Fix sign reversal in adjtime(2).
Approved by: jkh
|
#
54299 |
|
08-Dec-1999 |
phk |
Make adjtime(2) adjust boottime so it doesn't cause non-monotonous uptime.
|
#
53751 |
|
27-Nov-1999 |
bde |
Fixed some comments in statclock(). The previous commit made it clearer that one comment was attached to null code.
|
#
53745 |
|
27-Nov-1999 |
bde |
Moved scheduling-related code to kern_synch.c so that it is easier to fix and extend. The new function containing the code is named schedclock() as in NetBSD, but it has slightly different semantics (it already handles incrementation of p->p_cpticks, and it should handle any calling frequency).
Agreed with in principle by: dufault
|
#
52097 |
|
10-Oct-1999 |
peter |
#ifdef PPS_SYNC around "kapi" declaration to fix a -Wunused warning.
|
#
52061 |
|
09-Oct-1999 |
jhay |
Update the PPSAPI to draft-mogul-pps-api-05.txt which is the latest.
NOTE: This will break building ntpd until ntpd has been upgraded to also support draft 05. People that want to build ntpd in the meantime can get patches from me.
|
#
51229 |
|
13-Sep-1999 |
bde |
Moved the definition of `boottime' and its sysctl to the correct file.
|
#
50477 |
|
27-Aug-1999 |
peter |
$Id$ -> $FreeBSD$
|
#
49065 |
|
24-Jul-1999 |
bde |
Oops, the previous commit only worked in the one case it was tested for.
|
#
48887 |
|
18-Jul-1999 |
bde |
Added a sysctl "kern.timecounter.hardware" for selecting the hardware used for timecounting. The possible values are the names of the physically present harware timecounters ("i8254" and "TSC" on i386's).
Fixed some nearby bitrot in comments in <sys/time.h>.
Reviewed by: phk
|
#
48872 |
|
17-Jul-1999 |
jdp |
Remove four no-op casts.
|
#
46054 |
|
25-Apr-1999 |
phk |
Make the machdep.i8254_freq and machdep.tsc_freq sysctls modify the timecounter as well
Asked for by: bde, jhay
|
#
45245 |
|
02-Apr-1999 |
phk |
Don't open window for race condition.
Detected by: Reg Clemens <reg@dwf.com>
|
#
44695 |
|
12-Mar-1999 |
phk |
Fix an old cut&paste bogon.
Noticed by: bde
|
#
44685 |
|
12-Mar-1999 |
phk |
Remove duplicate include.
Noticed by: bde
|
#
44666 |
|
11-Mar-1999 |
phk |
Make even more of the PPSAPI implementations generic.
FLL support in hardpps()
Various magic shuffles and improved comments
Style fixes from Bruce.
|
#
44574 |
|
08-Mar-1999 |
phk |
Integrate the new "nanokernel" PLL from Dave Mills.
This code is backwards compatible with the older "microkernel" PLL, but allows ntpd v4 to use nanosecond resolution. Many other improvements.
PPS_SYNC and hardpps() are NOT supported yet.
|
#
44157 |
|
19-Feb-1999 |
luoqi |
Introduce machine-dependent macro pgtok() to convert page count to number of kilobytes. Its definition for each architecture could be optimized to avoid potential numerical overflows.
|
#
44146 |
|
19-Feb-1999 |
luoqi |
Hide access to vmspace:vm_pmap with inline function vmspace_pmap(). This is the preparation step for moving pmap storage out of vmspace proper.
Reviewed by: Alan Cox <alc@cs.rice.edu> Matthew Dillion <dillon@apollo.backplane.com>
|
#
41415 |
|
29-Nov-1998 |
phk |
Make the previous behaviour the default, add a sysctl which you can set if your hw/sw produces the "calcru negative..." message.
Setting the alternate method (sysctl -w kern.timecounter.method=1) makes the the get{nano|micro}*() functions call the real thing at resulting in a measurable but minor overhead.
I decided to NOT have the "calcru" change the method automatically because you should be aware of this problem if you have it.
The problems currently seen, related to usleep and a few other corners are fixed for both methods.
|
#
41306 |
|
23-Nov-1998 |
phk |
Make timecounters more resistant to badly behaved SW/HW which locks out interrupts for too long. If you still see the "calcru: negative time..." message you can increase NTIMECOUNTER (see LINT).
Sideeffect is that a timecounter is required to not wrap around in less than (1 + delta) seconds instead of the (1/hz + delta) required until now.
Many thanks to: msmith, wpaul, wosch & bde
|
#
41305 |
|
23-Nov-1998 |
sos |
Add a kludge to prevent panicing when using VM86 and hitting here with a NULL curproc.
Originally by: Tor Egge (IIRC)
|
#
40657 |
|
26-Oct-1998 |
bde |
Fixed breakage of the GPROF case of statclock() in the previous commit.
|
#
40648 |
|
25-Oct-1998 |
phk |
Nitpicking and dusting performed on a train. Removes trivial warnings about unused variables, labels and other lint.
|
#
40609 |
|
23-Oct-1998 |
phk |
Change the way we simulate stable storage for timecounters.
If you have problems with the "calcru" messages and processes being killed for excessive cpu time, try to increase the NTIMECOUNTER #define and report your findings.
|
#
40012 |
|
06-Oct-1998 |
alex |
Cast the return value of tvtohz() from a long to an int to satisfy the compiler that we know what we're doing (the value returned has already been restricted to int ranges).
Reviewed by: bde
|
#
39245 |
|
15-Sep-1998 |
gibbs |
kern_clock.c: Remove old disk statistics variables.
vfs_bio.c: Enable bowrite now that B_ORDERED works for all buffer devices.
|
#
38129 |
|
05-Aug-1998 |
bde |
Removed unused function hzto().
|
#
37555 |
|
11-Jul-1998 |
bde |
Fixed printf format errors.
|
#
37383 |
|
04-Jul-1998 |
phk |
Hmm, braino in last commit.
|
#
37382 |
|
04-Jul-1998 |
phk |
Change the sign on a race-condition, so that instead of ending up several tens of milliseconds out in the future we end up the right place with a subweeniesecond error.
|
#
37345 |
|
02-Jul-1998 |
phk |
When we transfer time from one timecounter to the next, use nanouptime(), not nanotime(); Otherwise we end up in 2026...
Fix the arg to dummy_get_timecount()
|
#
36810 |
|
09-Jun-1998 |
phk |
Add a tc_ prefix to struct timecounter members.
Urged by: bde
|
#
36741 |
|
07-Jun-1998 |
phk |
Add a member function more to the timecounters, this one is for use with latch based PPS implementations. The client that uses it will be committed after more testing.
|
#
36719 |
|
07-Jun-1998 |
phk |
Add a "this" style argument and a "void *private" so timecounters can figure out which instance to wount with.
|
#
36441 |
|
28-May-1998 |
phk |
Some cleanups related to timecounters and weird ifdefs in <sys/time.h>.
Clean up (or if antipodic: down) some of the msgbuf stuff.
Use an inline function rather than a macro for timecounter delta.
Maintain process "on-cpu" time as 64 bits of microseconds to avoid needless second rollover overhead.
Avoid calling microuptime the second time in mi_switch() if we do not pass through _idle in cpu_switch()
This should reduce our context-switch overhead a bit, in particular on pre-P5 and SMP systems.
WARNING: Programs which muck about with struct proc in userland will have to be fixed.
Reviewed, but found imperfect by: bde
|
#
36199 |
|
19-May-1998 |
phk |
Change a data type internal to the timecounters, and remove the "delta" function.
Reviewed, but not entirely approved by: bde
|
#
36119 |
|
17-May-1998 |
phk |
s/nanoruntime/nanouptime/g s/microruntime/microuptime/g
Reviewed by: bde
|
#
35099 |
|
08-Apr-1998 |
phk |
Minor adjustments to the timecounting and proc0.
Mostly Submitted by: bde
|
#
35058 |
|
06-Apr-1998 |
phk |
Make a kernel version of the timer* functions called timerval* to be more consistent.
OK'ed by: bde
|
#
35044 |
|
05-Apr-1998 |
phk |
Make the dummy timecounter run at 1 MHz rather than 100kHz (noticed by bde) fix the itimer(REAL) handling.
|
#
35033 |
|
04-Apr-1998 |
phk |
Handle double fraction overflow in nano & microtime functions (spotted by Bruce) Use tvtohz() a place where it fits.
|
#
35029 |
|
04-Apr-1998 |
phk |
Time changes mark 2:
* Figure out UTC relative to boottime. Four new functions provide time relative to boottime.
* move "runtime" into struct proc. This helps fix the calcru() problem in SMP.
* kill mono_time.
* add timespec{add|sub|cmp} macros to time.h. (XXX: These may change!)
* nanosleep, select & poll takes long sleeps one day at a time
Reviewed by: bde Tested by: ache and others
|
#
34975 |
|
31-Mar-1998 |
phk |
Fix an off by 1<<32 error.
|
#
34974 |
|
31-Mar-1998 |
phk |
Add a dummy timecounter until we find the real thing(s).
|
#
34961 |
|
30-Mar-1998 |
phk |
Eradicate the variable "time" from the kernel, using various measures. "time" wasn't a atomic variable, so splfoo() protection were needed around any access to it, unless you just wanted the seconds part.
Most uses of time.tv_sec now uses the new variable time_second instead.
gettime() changed to getmicrotime(0.
Remove a couple of unneeded splfoo() protections, the new getmicrotime() is atomic, (until Bruce sets a breakpoint in it).
A couple of places needed random data, so use read_random() instead of mucking about with time which isn't random.
Add a new nfs_curusec() function.
Mark a couple of bogosities involving the now disappeard time variable.
Update ffs_update() to avoid the weird "== &time" checks, by fixing the one remaining call that passwd &time as args.
Change profiling in ncr.c to use ticks instead of time. Resolution is the same.
Add new function "tvtohz()" to avoid the bogus "splfoo(), add time, call hzto() which subtracts time" sequences.
Reviewed by: bde
|
#
34901 |
|
26-Mar-1998 |
phk |
Add two new functions, get{micro|nano}time.
They are atomic, but return in essence what is in the "time" variable. gettime() is now a macro front for getmicrotime().
Various patches to use the two new functions instead of the various hacks used in their absence.
Some puntuation and grammer patches from Bruce.
A couple of XXX comments.
|
#
34618 |
|
16-Mar-1998 |
phk |
A bunch of BNN (Bruce Normal Nits) from bde: Bring back the softclock inlining save a couple of <<32's many white-space shuffles.
|
#
33690 |
|
20-Feb-1998 |
phk |
Replace TOD clock code with more systematic approach.
Highlights: * Simple model for underlying hardware. * Hardware basis for timekeeping can be changed on the fly. * Only one hardware clock responsible for TOD keeping. * Provides a real nanotime() function. * Time granularity: .232E-18 seconds. * Frequency granularity: .238E-12 s/s * Frequency adjustment is continuous in time. * Less overhead for frequency adjustment. * Improves xntpd performance.
Reviewed by: bde, bde, bde
|
#
33391 |
|
15-Feb-1998 |
phk |
Add a nanotime() function so that we can start to use this call.
|
#
33134 |
|
06-Feb-1998 |
eivind |
Back out DIAGNOSTIC changes.
|
#
33108 |
|
04-Feb-1998 |
eivind |
Turn DIAGNOSTIC into a new-style option.
|
#
32513 |
|
14-Jan-1998 |
phk |
Move almost all the ntp related stuff from kern_clock.c to kern_ntptime.c. The only bit left over is that which is executed in all calls to hardclock(). Various cleanups and staticizing along the road.
|
#
32444 |
|
11-Jan-1998 |
phk |
Try to solve timeout race by not touching softtics here.
|
#
32412 |
|
10-Jan-1998 |
phk |
Fix softclock calling so we don't loose timeouts (I broke this ~10h ago)
|
#
32391 |
|
10-Jan-1998 |
phk |
Whoops. softclock is called from doreti_swi as well. Abandon call from hardclock().
Forgot this:
Pointed hat sent by: bd
|
#
32388 |
|
10-Jan-1998 |
phk |
Effect the divorce of kern_clock.c and kern_timeout.c (which was repository copied from kern_clock.c)
|
#
32323 |
|
07-Jan-1998 |
phk |
Improve hardpps readability a bit: * Rename usec to p_usec so you can search for it. * Macroize the huge median_of_3_samples if statement.
|
#
31950 |
|
23-Dec-1997 |
nate |
This patch causes the "calltodo" timer list to be decremented by the amount of time that the laptop was suspending. Thus, select() calls that might have suspended rather than firing at 1hr + "time suspended" since the timer was posted.
Adding:
options APM_FIXUP_CALLTODO
to the kernel config enables the patch.
[ This patch was slightly modified to use a consistant indent style and I removed some unused local variables. After this has been tested a few weeks we'll make the options the default, so for now I'm now documenting it in LINT. Mike can later if he wants. ]
Reviewed by: Mike Smith <msmith@freebsd.org> Submitted by: Ken Key <key@cs.utk.edu>
|
#
31639 |
|
08-Dec-1997 |
fsmp |
The improvements to clock statistics by Tor Egge Wrappered and enabled by the define BETTER_CLOCK (on by default in smpyests.h)
Reviewed by: smp@csn.net Submitted by: Tor Egge <Tor.Egge@idi.ntnu.no>
|
#
31393 |
|
24-Nov-1997 |
bde |
Removed all traces of P_IDLEPROC. It was tested but never set.
|
#
31259 |
|
18-Nov-1997 |
bde |
Removed unused #include.
|
#
31016 |
|
07-Nov-1997 |
phk |
Remove a bunch of variables which were unused both in GENERIC and LINT.
Found by: -Wunused
|
#
29805 |
|
24-Sep-1997 |
gibbs |
Store an absolute tick value in callout entries so that a subtraction on hash chain traversal isn't needed. This also allows untimeout to recompute the hash to find the bucket that the entry to remove is stored in so that each callout entry no longer needs to store that information.
Reviewed by: Nate Williams <nate@mt.sri.com>
|
#
29680 |
|
21-Sep-1997 |
gibbs |
init_main.c subr_autoconf.c: Add support for "interrupt driven configuration hooks". A component of the kernel can register a hook, most likely during auto-configuration, and receive a callback once interrupt services are available. This callback will occur before the root and dump devices are configured, so the configuration task can affect the selection of those two devices or complete any tasks that need to be performed prior to launching init. System boot is posponed so long as a hook is registered. The hook owner is responsible for removing the hook once their task is complete or the system boot can continue.
kern_acct.c kern_clock.c kern_exit.c kern_synch.c kern_time.c: Change the interface and implementation for the kernel callout service. The new implemntaion is based on the work of Adam M. Costello and George Varghese, published in a technical report entitled "Redesigning the BSD Callout and Timer Facilities". The interface used in FreeBSD is a little different than the one outlined in the paper. The new function prototypes are:
struct callout_handle timeout(void (*func)(void *), void *arg, int ticks);
void untimeout(void (*func)(void *), void *arg, struct callout_handle handle);
If a client wishes to remove a timeout, it must store the callout_handle returned by timeout and pass it to untimeout.
The new implementation gives 0(1) insert and removal of callouts making this interface scale well even for applications that keep 100s of callouts outstanding.
See the updated timeout.9 man page for more details.
|
#
29179 |
|
07-Sep-1997 |
bde |
Some staticized variables were still declared to be extern.
|
#
29041 |
|
02-Sep-1997 |
bde |
Removed unused #includes.
|
#
28551 |
|
21-Aug-1997 |
bde |
#include <machine/limits.h> explicitly in the few places that it is required.
|
#
26897 |
|
24-Jun-1997 |
jhay |
Add tickadj to struct clockinfo, like NetBSD and OpenBSD. NOTE: libc, time, kgmon and rpc.rstatd will have to be recompiled.
|
#
25164 |
|
26-Apr-1997 |
peter |
Man the liferafts! Here comes the long awaited SMP -> -current merge!
There are various options documented in i386/conf/LINT, there is more to come over the next few days.
The kernel should run pretty much "as before" without the options to activate SMP mode.
There are a handful of known "loose ends" that need to be fixed, but have been put off since the SMP kernel is in a moderately good condition at the moment.
This commit is the result of the tinkering and testing over the last 14 months by many people. A special thanks to Steve Passe for implementing the APIC code!
|
#
24117 |
|
22-Mar-1997 |
mpp |
Restore Bruce's original comment. It seems that "iff" = if and only if, and is not a typo. It is used other places in the kernel, too.
|
#
24109 |
|
22-Mar-1997 |
mpp |
Fix a typo in a comment of a recent commit.
|
#
24101 |
|
22-Mar-1997 |
bde |
Fixed some invalid (non-atomic) accesses to `time', mostly ones of the form `tv = time'. Use a new function gettime(). The current version just forces atomicicity without fixing precision or efficiency bugs. Simplified some related valid accesses by using the central function.
|
#
22975 |
|
22-Feb-1997 |
peter |
Back out part 1 of the MCFH that changed $Id$ to $FreeBSD$. We are not ready for it yet.
|
#
22521 |
|
10-Feb-1997 |
dyson |
This is the kernel Lite/2 commit. There are some requisite userland changes, so don't expect to be able to run the kernel as-is (very well) without the appropriate Lite/2 userland changes.
The system boots and can mount UFS filesystems.
Untested: ext2fs, msdosfs, NFS Known problems: Incorrect Berkeley ID strings in some files. Mount_std mounts will not work until the getfsent library routine is changed.
Reviewed by: various people Submitted by: Jeffery Hsu <hsu@freebsd.org>
|
#
21673 |
|
14-Jan-1997 |
jkh |
Make the long-awaited change from $Id$ to $FreeBSD$
This will make a number of things easier in the future, as well as (finally!) avoiding the Id-smashing problem which has plagued developers for so long.
Boy, I'm glad we're not using sup anymore. This update would have been insane otherwise.
|
#
21101 |
|
30-Dec-1996 |
jhay |
Update our kernel ntp code to the latest from David Mills. The main change is the addition of the FLL code, which is used by the latest versions of xntpd. The kernel PPS code is also updated, although I can't test that yet.
|
#
19172 |
|
25-Oct-1996 |
bde |
Improved biasing of i586 clock by adjusting for hardclock() latency. I decided to do this for every hardclock() call instead of lazily in microtime(). The lazy method is simpler but has more overhead if microtime() is called a lot.
CPU_THISTICKLEN() is now a no-op and should probably go away. Previously it did nothing directly but had the side effect of setting i586_last_tick for CPU_CLOCKUPDATE() and i586_avg_tick for debugging. CPU_CLOCKUPDATE() now uses a better method and i586_avg_tick is too much trouble to maintain.
Reduced nesting of #includes in the usual case.
Increased nesting of #includes when CLOCK_HAIR is defined. This is a kludge to get typedefs for inline functions only when the inline functions are used. Normally only kern_clock.c defines this. kern_clock.c can't include the i386 headers directly.
Removed unused LOCORE support.
|
#
18855 |
|
10-Oct-1996 |
bde |
Don't include "opt_cpu.h" in <machine/clock.h>, since this breaks lkm's. The change breaks kern_clock.c; fix that temporarily by including "opt_cpu.h" there.
|
#
17342 |
|
30-Jul-1996 |
bde |
Fixed resource usage integrals. They were too large by a factor of of profhz/stathz when profiling was enabled.
|
#
16635 |
|
23-Jun-1996 |
bde |
Unstaticize psratio and staticize profprocs. psratio needs to be exported to trap.c to fix user profiling.
|
#
12913 |
|
17-Dec-1995 |
phk |
Staticize. Unstaticize a function in scsi/scsi_base that was used, with an undocumented option. My last count on the LINT kernel shows: Total symbols: 3647 unref symbols: 463 undef symbols: 4 1 ref symbols: 1751 2 ref symbols: 485 Approaching the pain threshold now.
|
#
12662 |
|
07-Dec-1995 |
dg |
Untangled the vm.h include file spaghetti.
|
#
12650 |
|
06-Dec-1995 |
phk |
A couple of minor tweaks to the sysctl stuff.
|
#
12623 |
|
04-Dec-1995 |
phk |
A major sweep over the sysctl stuff.
Move a lot of variables home to their own code (In good time before xmas :-)
Introduce the string descrition of format.
Add a couple more functions to poke into these marvels, while I try to decide what the correct interface should look like.
Next is adding vars on the fly, and sysctl looking at them too.
Removed a tine bit of defunct and #ifdefed notused code in swapgeneric.
|
#
12569 |
|
02-Dec-1995 |
bde |
Finished (?) cleaning up sysinit stuff.
|
#
12243 |
|
12-Nov-1995 |
phk |
The entire sysctl callback to read/write version. I havn't tested this as much as I'd like to, but the malloc stunt I tried for an interim for sure does worse. Now we can read and write from any kind of address-space, not only user and kernel, using callbacks. This may be over-generalization for now, but it's actually simpler.
|
#
12152 |
|
08-Nov-1995 |
phk |
Fix some of the sysctl broke, and add a lot more to it.
|
#
11451 |
|
12-Oct-1995 |
wollman |
Improve clock accuracy by accounting for late/missed clock interrupts if the hardware supports it.
|
#
10653 |
|
09-Sep-1995 |
dg |
Fixed init functions argument type - caddr_t -> void *. Fixed a couple of compiler warnings.
|
#
10358 |
|
28-Aug-1995 |
julian |
Reviewed by: julian with quick glances by bruce and others Submitted by: terry (terry lambert) This is a composite of 3 patch sets submitted by terry. they are: New low-level init code that supports loadbal modules better some cleanups in the namei code to help terry in 16-bit character support some changes to the mount-root code to make it a little more modular..
NOTE: mounting root off cdrom or NFS MIGHT be broken as I haven't been able to test those cases..
certainly mounting root of disk still works just fine.. mfs should work but is untested. (tomorrows task)
The low level init stuff includes a total rewrite of init_main.c to make it possible for new modules to have an init phase by simply adding an entry to a TEXT_SET (or is it DATA_SET) list. thus a new module can be added to the kernel without editing any other files other than the 'files' file.
|
#
9759 |
|
29-Jul-1995 |
bde |
Eliminate sloppy common-style declarations. There should be none left for the LINT configuation.
|
#
8876 |
|
30-May-1995 |
rgrimes |
Remove trailing whitespace.
|
#
7090 |
|
16-Mar-1995 |
bde |
Add and move declarations to fix all of the warnings from `gcc -Wimplicit' (except in netccitt, netiso and netns) and most of the warnings from `gcc -Wnested-externs'. Fix all the bugs found. There were no serious ones.
|
#
5081 |
|
12-Dec-1994 |
bde |
Obtained from: my old fix for 1.1.5
Improve hzto():
Round up instead of down and then add 1 tick. This fixes sleep(1) sometimes sleeping for < 1 second and usleep(10000) sometimes sleeping for as little as 1 usec + syscall time.
Don't do all the calculations at splhigh().
Don't depend on `tick' being a multiple of 1000.
Don't lose accuracy for `sec' between 0x7fffffff / 1000 - 1000 and 0x7fffffff / hz.
Don't assume that longs are 32 bits or that ints have the same size as longs.
|
#
3640 |
|
16-Oct-1994 |
wollman |
kern_clock.c: define dk_names[][]. kern_sysctl.c: call dev_sysctl for hw.devconf mib subtree kern_devconf.c: sysctl-accessible device-configuration and -management interface
|
#
3308 |
|
02-Oct-1994 |
phk |
All of this is cosmetic. prototypes, #includes, printfs and so on. Makes GCC a lot more silent.
|
#
3183 |
|
28-Sep-1994 |
wollman |
Fixed bug in hardclock() that caused adjtime() to fail when given a negative offset. This would be seen in xntpd as a rash of ``Previous time adjustment didn't complete'' messages on startup.
|
#
3098 |
|
25-Sep-1994 |
phk |
While in the real world, I had a bad case of being swapped out for a lot of cycles. While waiting there I added a lot of the extra ()'s I have, (I have never used LISP to any extent). So I compiled the kernel with -Wall and shut up a lot of "suggest you add ()'s", removed a bunch of unused var's and added a couple of declarations here and there. Having a lap-top is highly recommended. My kernel still runs, yell at me if you kernel breaks.
|
#
2858 |
|
18-Sep-1994 |
wollman |
Redo Kernel NTP PLL support, kernel side.
This code is mostly taken from the 1.1 port (which was in turn taken from Dave Mills's kern.tar.Z example). A few significant differences:
1) ntp_gettime() is now a MIB variable rather than a system call. A few fiddles are done in libc to make it behave the same.
2) mono_time does not participate in the PLL adjustments.
3) A new interface has been defined (in <machine/clock.h>) for doing possibly machine-dependent things around the time of the clock update. This is used in Pentium kernels to disable interrupts, set `time', and reset the CPU cycle counter as quickly as possible to avoid jitter in microtime(). Measurements show an apparent resolution of a bit more than 8.14usec, which is reasonable given system-call overhead.
|
#
2320 |
|
27-Aug-1994 |
dg |
1) Changed ddb into a option rather than a pseudo-device (use options DDB in your kernel config now). 2) Added ps ddb function from 1.1.5. Cleaned it up a bit and moved into its own file. 3) Added \r handing in db_printf. 4) Added missing memory usage stats to statclock(). 5) Added dummy function to pseudo_set so it will be emitted if there are no other pseudo declarations.
|
#
2112 |
|
18-Aug-1994 |
wollman |
Fix up some sloppy coding practices:
- Delete redundant declarations. - Add -Wredundant-declarations to Makefile.i386 so they don't come back. - Delete sloppy COMMON-style declarations of uninitialized data in header files. - Add a few prototypes. - Clean up warnings resulting from the above.
NB: ioconf.c will still generate a redundant-declaration warning, which is unavoidable unless somebody volunteers to make `config' smarter.
|
#
1817 |
|
02-Aug-1994 |
dg |
Added $Id$
|
#
1549 |
|
25-May-1994 |
rgrimes |
The big 4.4BSD Lite to FreeBSD 2.0.0 (Development) patch.
Reviewed by: Rodney W. Grimes Submitted by: John Dyson and David Greenman
|
#
1541 |
|
24-May-1994 |
rgrimes |
BSD 4.4 Lite Kernel Sources
|