#
267654 |
|
19-Jun-2014 |
gjb |
Copy stable/9 to releng/9.3 as part of the 9.3-RELEASE cycle.
Approved by: re (implicit) Sponsored by: The FreeBSD Foundation |
#
248085 |
|
09-Mar-2013 |
marius |
MFC: r227309 (partial)
Mark all SYSCTL_NODEs static that have no corresponding SYSCTL_DECLs.
The SYSCTL_NODE macro defines a list that stores all child-elements of that node. If there's no SYSCTL_DECL macro anywhere else, there's no reason why it shouldn't be static.
|
#
235515 |
|
16-May-2012 |
jhb |
MFC 233709,233781,233793: - Don't malloc() new MCA records for machine checks logged due to a CMCI or MC# exception. Instead, use a pre-allocated pool of records. When a CMCI or MC# exception fires, schedule a task to refill the pool. The pool is sized to hold at least one record per available machine bank, and one record per CPU. This should handle the case of all CPUs triggering a single bank at once as well as the case a single CPU triggering all of its banks. The periodic scans still use malloc() since they are run from a safe context. - Make machine check exception logging more readable. On newer Intel systems, an uncorrected ECC error tends to fire on all CPUs in a package simultaneously and the current printf hacks are not sufficient to make the messages legible. Instead, use the existing mca_lock spinlock to serialize calls to mca_log() and change the machine check code to panic directly when an unrecoverable error is encoutered rather than falling back to a trap_fatal() call in trap() (which adds nearly a screen-full of logging messages that aren't useful for machine checks).
|
#
225736 |
|
22-Sep-2011 |
kensmith |
Copy head to stable/9 as part of 9.0-RELEASE release cycle.
Approved by: re (implicit)
|
#
218221 |
|
03-Feb-2011 |
jhb |
Use a dedicated taskqueue with a thread that runs at a software-interrupt priority for the periodic polling of the machine check registers.
|
#
214630 |
|
01-Nov-2010 |
jhb |
Move the <machine/mca.h> header to <x86/mca.h>.
|
#
210577 |
|
28-Jul-2010 |
jhb |
The corrected error count field is dependent on CMCI, not TES.
MFC after: 1 week
|
#
209212 |
|
15-Jun-2010 |
jhb |
Restore the machine check register banks on resume. For banks being monitored via CMCI, reset the interrupt threshold to 1 on resume.
Reviewed by: jkim MFC after: 2 weeks
|
#
209059 |
|
11-Jun-2010 |
jhb |
Update several places that iterate over CPUs to use CPU_FOREACH().
|
#
208921 |
|
08-Jun-2010 |
jhb |
Move the machine check support code to the x86 tree since it is identical on i386 and amd64.
Requested by: alc
|
#
208621 |
|
28-May-2010 |
jhb |
Defer initializing machine checks for the boot CPU until the local APIC is fully configured.
MFC after: 1 month
|
#
208556 |
|
25-May-2010 |
jhb |
Only enable CMCI on i386 if 'device apic' is enabled in the kernel since it requires the local APIC to work.
|
#
208507 |
|
24-May-2010 |
jhb |
Add support for corrected machine check interrupts. CMCI is a new local APIC interrupt that fires when a threshold of corrected machine check events is reached. CMCI also includes a count of events when reporting corrected errors in the bank's status register. Note that individual banks may or may not support CMCI. If they do, each bank includes its own threshold register that determines when the interrupt fires. Currently the code uses a very simple strategy where it doubles the threshold on each interrupt until it succeeds in throttling the interrupt to occur only once a minute (this interval can be tuned via sysctl). The threshold is also adjusted on each hourly poll which will lower the threshold once events stop occurring.
Tested by: Sailaja Bangaru sbappana at yahoo com MFC after: 1 month
|
#
205573 |
|
24-Mar-2010 |
alc |
Adapt r204907 and r205402, the amd64 implementation of the workaround for AMD Family 10h Erratum 383, to i386.
Enable machine check exceptions by default, just like r204913 for amd64.
Enable superpage promotion only if the processor actually supports large pages, i.e., PG_PS.
MFC after: 2 weeks
|
#
205214 |
|
16-Mar-2010 |
jhb |
- Extend the machine check record structure to include several fields useful for parsing model-specific and other fields in machine check events including the global machine check capabilities and status registers, CPU identification, and the FreeBSD CPU ID. - Report these added fields in the console log of a machine check so that a record structure can be reconstituted from the console messages. - Parse new architectural errors including memory controller errors.
MFC after: 1 week
|
#
204518 |
|
01-Mar-2010 |
jhb |
Print the contents of the miscellaneous (MISC) register to the console if it is valid along with the other register values when a machine check is encountered.
MFC after: 1 week
|
#
200064 |
|
03-Dec-2009 |
avg |
mca: small enhancements related to cpu quirks
- use utility macros for CPU family/model checking - limit Intel P6 quirk to pre-Nehalem models (taken from OpenSolaris) - add AMD GartTblWkEn quirk for families 0Fh and 10h; I haven't experienced any problems without the quirk but both Linux and OpenSolaris do this - slightly re-arrange quirk code to provide for the future generalization and separation of vendor-specific quirk functions
Reviewed by: jhb MFC after: 1 week
|
#
200033 |
|
02-Dec-2009 |
avg |
mca: improve status checking, recording and reporting
- directly print mca information in case we fail to allocate memory for a record - include bank number into mca record - print raw mca status value for extended information
Reviewed by: jhb MFC after: 10 days
|
#
192440 |
|
20-May-2009 |
jhb |
Don't bother reading the initial value of the machine check banks during startup on Pentium 4 CPUs. This wasn't safe to do on APs during AP startup, was of limited value, and won't be used for future processors.
|
#
192343 |
|
18-May-2009 |
jhb |
- Add a tunable 'hw.mca.enabled' that can be used to enable/disable the machine check code. Disable it by default for now. - When computing the mask of bits that determines a non-restartable event during a machine check exception, or-in the overflow flag rather than replacing the other flags.
PR: i386/134586 [2] Submitted by: Andi Kleen andi-fbsd firstfloor.org
|
#
192050 |
|
13-May-2009 |
jhb |
Implement simple machine check support for amd64 and i386. - For CPUs that only support MCE (the machine check exception) but not MCA (i.e. Pentium), all this does is print out the value of the machine check registers and then panic when a machine check exception occurs. - For CPUs that support MCA (the machine check architecture), the support is a bit more involved. - First, there is limited support for decoding the CPU-independent MCA error codes in the kernel, and the kernel uses this to output a short description of any machine check events that occur. - When a machine check exception occurs, all of the MCx banks on the current CPU are scanned and any events are reported to the console before panic'ing. - To catch events for correctable errors, a periodic timer kicks off a task which scans the MCx banks on all CPUs. The frequency of these checks is controlled via the "hw.mca.interval" sysctl. - Userland can request an immediate scan of the MCx banks by writing a non-zero value to "hw.mca.force_scan". - If any correctable events are encountered, the appropriate details are stored in a 'struct mca_record' (defined in <machine/mca.h>). The "hw.mca.count" is a count of such records and each record may be queried via the "hw.mca.records" tree by specifying the record index (0 .. count - 1) as the next name in the MIB similar to using PIDs with the kern.proc.* sysctls. The idea is to export machine check events to userland for more detailed processing. - The periodic timer and hw.mca sysctls are only present if the CPU supports MCA.
Discussed with: emaste (briefly) MFC after: 1 month
|