History log of /freebsd-current/sys/dev/acpica/acpi_cpu.c
Revision Date Author Comments
# 5bc10fea 25-Dec-2023 Alexander Motin <mav@FreeBSD.org>

acpi_cpu: Reduce BUS_MASTER_RLD manipulations

Instead of setting and clearing BUS_MASTER_RLD register on every C3
state enter/exit, set it only once if the system supports C3 state
and we are going to "disable" bus master arbitration while in it.

Linux has done this for the past 14 years, and for even longer this
register has not been implemented in relevant hardware. At the same time,
since this is only a single bit in a bigger register, ACPI has to
take a global lock and do a read-modify-write for it, which is too
expensive. It is saved only by C3 not being entered frequently, but it
is still visible in idle system CPU profiles.
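
As a rough illustration of the cost difference described above, the sketch below models a bit register whose every write costs one locked read-modify-write cycle, and counts those cycles for the two strategies. This is a toy model, not driver code: struct bitreg_model and all function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a bit register guarded by a global lock. */
struct bitreg_model {
	int		bm_rld;		/* current value of the modeled bit */
	unsigned long	rmw_count;	/* locked read-modify-write cycles */
};

/* Every write is modeled as one lock + read-modify-write cycle. */
void
bitreg_write(struct bitreg_model *m, int value)
{
	assert(m != NULL);
	m->bm_rld = value;
	m->rmw_count++;
}

/* Old strategy: set the bit on C3 entry and clear it on exit. */
unsigned long
c3_sleeps_toggle(struct bitreg_model *m, unsigned long sleeps)
{
	unsigned long i;

	for (i = 0; i < sleeps; i++) {
		bitreg_write(m, 1);
		/* ... the CPU would sleep in C3 here ... */
		bitreg_write(m, 0);
	}
	return (m->rmw_count);
}

/* New strategy: set the bit once, then leave it alone. */
unsigned long
c3_sleeps_set_once(struct bitreg_model *m, unsigned long sleeps)
{
	unsigned long i;

	if (m->bm_rld == 0)
		bitreg_write(m, 1);
	for (i = 0; i < sleeps; i++) {
		/* ... the CPU would sleep in C3 here, no register access ... */
	}
	return (m->rmw_count);
}
```

In this model, 1000 sleeps cost 2000 locked cycles with the first strategy and a single cycle with the second.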

MFC after: 1 month


# 685dc743 16-Aug-2023 Warner Losh <imp@FreeBSD.org>

sys: Remove $FreeBSD$: one-line .c pattern

Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/


# 15bd2f36 23-Oct-2022 Tom Jones <thj@FreeBSD.org>

acpi: Create cppc_notify sysctl before it is checked

Reported by: Henrix
Reviewed by: jhb
Differential Revision: https://reviews.freebsd.org/D37081


# eee0f7ae 11-Oct-2022 Tom Jones <thj@FreeBSD.org>

acpi: Put CPPC workaround behind i386/amd64 ifdef

While CPPC is available on arm64 platforms with ACPI we don't know if we
need to work around issues with firmware there.


# 67f2a563 10-Oct-2022 Tom Jones <thj@FreeBSD.org>

acpi: Tell SMM we will handle CPPC notifications

Buggy SMM implementations can hang while processing CPPC notifications.
This leads to some laptops (notably Thinkpads) hanging when the
hwpstate_intel driver is loaded.

Tell the SMM that we will handle CPPC notifications as described in:

- Intel® Processor Vendor-Specific ACPI
- Intel® 64 and IA-32 Architectures Software Developer’s Manual

CPPC events default to masked (disabled) so while we do not do any
handling right now this does not seem to lead to any issues.

This approach was found via this Linux Kernel patch:
https://lkml.org/lkml/2016/3/17/563

PR: 253288
Reviewed by: imp, jhb
Sponsored by: Modirum
Sponsored by: Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D36699


# 916a5d8a 19-Apr-2022 John Baldwin <jhb@FreeBSD.org>

acpi: Remove unused devclass arguments to DRIVER_MODULE.


# e206dddc 21-Apr-2022 John Baldwin <jhb@FreeBSD.org>

acpi_cpu: Use device_get_devclass to find devclass in attach.

Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D34988


# b93f47ea 17-Mar-2022 Roger Pau Monné <royger@FreeBSD.org>

xen/acpi: upload Cx and Px data to Xen

When FreeBSD is running as dom0 (initial domain) on a Xen system it
has access to the native ACPI tables and is the OSPM. However the
hypervisor is the entity in charge of the CPU idle and frequency
states, and in order to perform this duty it requires information
found in the ACPI dynamic tables that can only be parsed by the OSPM.

Introduce a new Xen specific ACPI driver to fetch the Processor
related information and upload it to Xen. Note that this driver needs
to take precedence over the generic ACPI CPU driver when running as
dom0, so downgrade the probe score of the native driver to
BUS_PROBE_DEFAULT in order for the Xen specific driver to use
BUS_PROBE_SPECIFIC.

Tested on an Intel NUC to successfully parse and upload both the Cx and
Px states to Xen.

Sponsored by: Citrix Systems R&D
Reviewed by: jhb kib
Differential revision: https://reviews.freebsd.org/D34841


# 3e68d2c5 26-Dec-2021 Alexander Motin <mav@FreeBSD.org>

acpica: Remove CTLFLAG_NEEDGIANT from most sysctls.

MFC after: 2 weeks


# de291c5d 09-Dec-2021 Alexander Motin <mav@FreeBSD.org>

acpi_cpu: Replace Giant with bus_topo_lock.


# 4e50efb1 26-Sep-2021 Andrew Turner <andrew@FreeBSD.org>

Check cpu_softc is not NULL before dereferencing

In the acpi_cpu_postattach SYSINIT function cpu_softc may be NULL, e.g.
on arm64 when booting from FDT. Check it is not NULL at the start of
the function so we don't try to dereference a NULL pointer.

Sponsored by: The FreeBSD Foundation


# 695323ae 25-Sep-2021 Alexander Motin <mav@FreeBSD.org>

acpi_cpu: Fix panic if some CPU devices are disabled.

While there, remove a couple of unneeded global variables.


# c8077ccd 24-Sep-2021 Alexander Motin <mav@FreeBSD.org>

acpi_cpu: Make device unit numbers match OS CPU IDs.

There are already APIC ID, ACPI ID and OS ID for each CPU. In a perfect
world all of those may match, but at least for SuperMicro server boards
none of them do. Plus none of them match the CPU devices listing order
by ACPI. Previous code used the ACPI device listing order to number
cpuX devices. It looked nice from a NewBus perspective, but introduced
a 4th different set of IDs, an extremely confusing one, since in some places
the device unit numbers were treated as OS CPU IDs (coretemp), but not
in others (sysctl dev.cpu.X.%location).


# 2cee045b 10-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Move time math out of disabled interrupts sections.

We don't need the result before next sleep time, so no reason to
additionally increase interrupt latency.

While there, remove an extra PM ticks to microseconds conversion, which made
C2/C3 sleep times look 4 times smaller than they really were. The conversion
is already done by AcpiGetTimerDuration(). Now I see reported sleep
times up to 0.5s, just as expected for planned 2 wakeups per second.
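
The effect of the double conversion can be sketched with a little arithmetic. The ACPI PM timer runs at 3.579545 MHz, so a tick count is roughly four ticks per microsecond, and a cheap shift by two is a common approximation. Applying that shift to a value that is already in microseconds shrinks it by a factor of four. The helper names below are hypothetical and are not taken from acpi_cpu.c.

```c
#include <assert.h>
#include <stdint.h>

/* ACPI PM timer frequency in Hz (3.579545 MHz). */
#define	PM_TIMER_HZ	3579545ULL

/* Exact conversion from PM timer ticks to microseconds. */
uint64_t
pm_ticks_to_usec(uint64_t ticks)
{
	/* Guard against overflow of the intermediate product. */
	assert(ticks <= UINT64_MAX / 1000000ULL);
	return (ticks * 1000000ULL / PM_TIMER_HZ);
}

/* Cheap approximation: about 4 ticks per microsecond. */
uint64_t
pm_ticks_to_usec_approx(uint64_t ticks)
{
	return (ticks >> 2);
}
```

For example, if a duration of 500000 microseconds is mistakenly passed through pm_ticks_to_usec_approx() again, it is reported as 125000, four times smaller than the real sleep time.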

MFC after: 1 month


# 075e4807 08-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Do not read timer extra time when MWAIT is used.

When we enter C2+ state via memory read, it may take chipset some
time to stop CPU. Extra register read covers that time. But MWAIT
makes CPU stop immediately, so we don't need to waste time after
wakeup with interrupts still disabled, increasing latency.

On my system it reduces ping localhost latency, waking up all CPUs
once a second, from 277us to 242us.

MFC after: 1 month


# 455219675 08-Mar-2021 Alexander Motin <mav@FreeBSD.org>

Change mwait_bm_avoidance use to match Linux.

Even though the information is very limited, it seems the intent of
this flag is to control ACPI_BITREG_BUS_MASTER_STATUS use for C3,
not to force ACPI_BITREG_ARB_DISABLE manipulations for C2, where it was
never needed, and where the register has not really done anything for years.
It wasted lots of CPU time on the congested global ACPI hardware lock
when many CPU cores were trying to enter/exit deep C-states at the same time.

On an idle 80-core system it pushed ping localhost latency up to 20ms,
since badport_bandlim() via counter_ratecheck() wakes up all CPUs
at the same time once a second just to synchronously reset the counters.
Now enabling C-states increases the latency from 0.1 to just 0.25ms.

Discussed with: kib
MFC after: 1 month


# 82c28121 01-Sep-2020 Mateusz Guzik <mjg@FreeBSD.org>

acpica: clean up empty lines in .c and .h files


# 7029da5c 26-Feb-2020 Pawel Biernacki <kaktus@FreeBSD.org>

Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)

r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is a non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE. All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by: kib (mentor, blanket)
Commented by: kib, gallatin, melifaro
Differential Revision: https://reviews.freebsd.org/D23718


# 5efca36f 25-Oct-2018 Takanori Watanabe <takawata@FreeBSD.org>

Distinguish _CID match and _HID match and make lower priority probe
when _CID match.

Reviewed by: jhb, imp
Differential Revision: https://reviews.freebsd.org/D16468


# 43d9cb5b 07-May-2018 Warner Losh <imp@FreeBSD.org>

Use device_quiet_children to silence verbose CPU probe messages.

Have cpu0 be noisy, but all the other CPU devices be quiet on boot.


# e054cac7 18-Dec-2017 Conrad Meyer <cem@FreeBSD.org>

Implement ACPI CPU support when Processor object is not present

By the ACPI standard (ACPI 5 chapter 8.4 Declaring Processors) Processors
can be implemented in 2 distinct ways:

* Through a Processor object type (which provides P_BLK)
* Through a Device object type

Prior to this change, the FreeBSD driver only supported the former. AMD
Epyc / Poweredge systems we are testing both implement the latter only. Add
the missing support.

Because P_BLK is not defined in the device object case, C-states entering
must be completely controlled via _CST methods rather than P_LVL2/3.

John Baldwin points out that ACPI 6.0 formally deprecates the Processor
keyword, so eventually processors will only be enumerated as Device objects.

Submitted by: attilio
Reviewed by: jhb, markj, Anton Rang <rang AT acm.org>
Relnotes: maybe
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D13457


# d7bbccdd 30-Sep-2017 Jung-uk Kim <jkim@FreeBSD.org>

Revert r324109. This commit broke a number of systems.

Reported by: lwhsu, kib
Requested by: ngie


# 67d955aa 08-Apr-2017 Patrick Kelsey <pkelsey@FreeBSD.org>

Corrected misspelled versions of rendezvous.

The MFC will include a compat definition of smp_no_rendevous_barrier()
that calls smp_no_rendezvous_barrier().

Reviewed by: gnn, kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D10313


# 8cd59625 24-Feb-2017 Konstantin Belousov <kib@FreeBSD.org>

Remove cpu_deepest_sleep variable.

On Core2 and older Intel CPUs, where TSC stops in C2, system does not
allow C2 entrance if timecounter hardware is TSC. This is done by
tc_windup() which tests for TC_FLAGS_C2STOP flag of the new
timecounter and increases cpu_disable_c2_sleep if flag is set. Right
now init_TSC_tc() only sets the flag if cpu_deepest_sleep >= 2, but
TSC is initialized too early for this variable to be set by
acpi_cpu.c.

There is no reason to require that ACPI reported C2 and deeper states
to set TC_FLAGS_C2STOP, so remove cpu_deepest_sleep test from
init_TSC_tc() condition. And since this is the only use of the
variable, remove it altogether.

Reported and submitted by: Jia-Shiun Li <jiashiun@gmail.com>
Suggested by: jhb
MFC after: 2 weeks


# 19d4720b 08-Feb-2017 Jonathan T. Looney <jtl@FreeBSD.org>

Ensure the idle thread's loop services interrupts in a timely way when
using the ACPI C1/mwait sleep method.

Previously, the mwait instruction would return when an interrupt was
pending; however, the idle loop did not actually enable interrupts when
this occurred. This led to a situation where the idle loop could quickly
spin through the C1/mwait sleep method a number of times when an interrupt
was pending. (Eventually, the situation corrected itself when something
other than an interrupt triggered the idle loop to either enable interrupts
or schedule another thread.)

Reviewed by: kib, imp (earlier version)
Input from: jhb
MFC after: 1 week
Sponsored by: Netflix


# fdce57a0 14-May-2016 John Baldwin <jhb@FreeBSD.org>

Add an EARLY_AP_STARTUP option to start APs earlier during boot.

Currently, Application Processors (non-boot CPUs) are started by
MD code at SI_SUB_CPU, but they are kept waiting in a "pen" until
SI_SUB_SMP at which point they are released to run kernel threads.
SI_SUB_SMP is one of the last SYSINIT levels, so APs don't enter
the scheduler and start running threads until fairly late in the
boot.

This change moves SI_SUB_SMP up to just before software interrupt
threads are created allowing the APs to start executing kernel
threads much sooner (before any devices are probed). This allows
several initialization routines that need to perform initialization
on all CPUs to now perform that initialization in one step rather
than having to defer the AP initialization to a second SYSINIT run
at SI_SUB_SMP. It also permits all CPUs to be available for
handling interrupts before any devices are probed.

This last feature fixes a problem with interrupt vector exhaustion.
Specifically, in the old model all device interrupts were routed
onto the boot CPU during boot. Later after the APs were released at
SI_SUB_SMP, interrupts were redistributed across all CPUs.

However, several drivers for multiqueue hardware allocate N interrupts
per CPU in the system. In a system with many CPUs, just a few drivers
doing this could exhaust the available pool of interrupt vectors on
the boot CPU as each driver was allocating N * mp_ncpu vectors on the
boot CPU. Now, drivers will allocate interrupts on their desired CPUs
during boot meaning that only N interrupts are allocated from the boot
CPU instead of N * mp_ncpu.
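
The arithmetic above can be worked through with a small sketch. It only models the counting argument from the text; the function names and the idea of an identical N for every driver are simplifying assumptions.

```c
#include <assert.h>

/*
 * Old model: every driver allocates n_per_cpu vectors for each CPU, and
 * all of them initially land on the boot CPU.
 */
unsigned int
boot_cpu_vectors_old(unsigned int ndrivers, unsigned int n_per_cpu,
    unsigned int ncpus)
{
	assert(ncpus > 0);
	return (ndrivers * n_per_cpu * ncpus);
}

/*
 * Early AP startup: interrupts are placed on their target CPUs during
 * boot, so the boot CPU only carries its own share.
 */
unsigned int
boot_cpu_vectors_early(unsigned int ndrivers, unsigned int n_per_cpu)
{
	return (ndrivers * n_per_cpu);
}
```

With 4 such drivers, N = 2 and 64 CPUs, the boot CPU would need 512 vectors in the old model but only 8 with early AP startup.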

Some other bits of code can also be simplified as smp_started is
now true much earlier and will now always be true for these bits of
code. This removes the need to treat the single-CPU boot environment
as a special case.

As a transition aid, the new behavior is available under a new kernel
option (EARLY_AP_STARTUP). This will allow the option to be turned off
if need be during initial testing. I plan to enable this on x86 by
default in a followup commit in the next few days and to have all
platforms moved over before 11.0. Once the transition is complete,
the option will be removed along with the !EARLY_AP_STARTUP code.

These changes have only been tested on x86. Other platform maintainers
are encouraged to port their architectures over as well. The main
things to check for are any uses of smp_started in MD code that can be
simplified and SI_SUB_SMP SYSINITs in MD code that can be removed in
the EARLY_AP_STARTUP case (e.g. the interrupt shuffling).

PR: kern/199321
Reviewed by: markj, gnn, kib
Sponsored by: Netflix


# 453130d9 02-May-2016 Pedro F. Giffuni <pfg@FreeBSD.org>

sys/dev: minor spelling fixes.

Most affect comments, very few have user-visible effects.


# 8d07a66d 28-Apr-2016 John Baldwin <jhb@FreeBSD.org>

Only count CPU devices that are using the ACPI CPU driver.

Arguably we should only be doing the probe/attach to children of
these devices as well.

Tested by: Michal Stanek <mst_semihalf.com> (arm64)
Differential Revision: https://reviews.freebsd.org/D6133


# 4c26ac69 22-Apr-2016 John Baldwin <jhb@FreeBSD.org>

Optionally return the output capabilities list from _OSC.

Both of the callers were expecting the input cap_set to be modified.
This fixes them to request cap_set to be updated with the returned buffer.

Reviewed by: jkim
Differential Revision: https://reviews.freebsd.org/D6040


# f8887b89 21-Apr-2016 John Baldwin <jhb@FreeBSD.org>

Queue the CPU-probing task after all acpi_cpu devices are attached.

Eventually with earlier AP startup this code will change to call the
startup function synchronously instead of queueing the task. Moving
the time we queue the task should be a no-op since taskqueue threads
don't start executing tasks until much later, but this reduces the diff
with the earlier AP startup patches.

Sponsored by: Netflix


# 87f0a4bf 20-Apr-2016 Jung-uk Kim <jkim@FreeBSD.org>

There is no need to use array any more. No functional change.


# cad6d222 20-Apr-2016 Jung-uk Kim <jkim@FreeBSD.org>

Remove query flag from acpi_EvaluateOSC(). This function does not support
return buffer (yet).


# 5f3dd91a 20-Apr-2016 John Baldwin <jhb@FreeBSD.org>

Add a wrapper for evaluating _OSC methods.

This wrapper does not translate errors in the first word to ACPI
error status returns. Use this wrapper in the acpi_cpu(4) driver in
place of the existing _OSC code. While here, fix a bug where the wrong
count of words was passed when invoking _OSC.

Reviewed by: jkim
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D6022


# 617994ef 11-Jun-2015 Andrew Turner <andrew@FreeBSD.org>

Add basic support for ACPI. It splits out the nexus driver to two new
drivers, one for fdt, one for acpi. It then uses this to decide if it will
use fdt or acpi.

The GICv2 (interrupt controller) and Generic Timer drivers have been
updated to handle both cases.

As this is early code we still need FDT to find the kernel console, and
some parts are still missing, including PCI support.

Differential Revision: https://reviews.freebsd.org/D2463
Reviewed by: jhb, jkim, emaste
Obtained from: ABT Systems Ltd
Relnotes: Yes
Sponsored by: The FreeBSD Foundation


# 9ae8e006 09-Jun-2015 Jung-uk Kim <jkim@FreeBSD.org>

Check status of AcpiReadBitRegister() calls.

Reported by: Coverity
CID: 1306132


# 9cf4cabe 21-May-2015 Jung-uk Kim <jkim@FreeBSD.org>

Do not probe Intel PIIX4 south bridge quirks on amd64. These quirky south
bridges only supported Intel Pentium and Pentium II era processors and there
is no reason for hardware virtualizations to emulate these quirks.

MFC after: 1 week


# 044a49cd 11-May-2015 Andrew Turner <andrew@FreeBSD.org>

Hide code only used on i386 and amd64.


# b57a73f8 08-May-2015 Konstantin Belousov <kib@FreeBSD.org>

If x86 CPU implementation of the MWAIT instruction reasonably
interacts with interrupts, query ACPI and use MWAIT for entrance into
Cx sleep states. Support C1 "I/O then halt" mode. See Intel's
document 302223-007 "Intel® Processor Vendor-Specific ACPI Interface
Specification" for description.

Move the acpi_cpu_c1() function into x86/cpu_machdep.c and use
it instead of inlining "sti; hlt" sequence in several places.

In the acpi(4) man page, besides documenting the dev.cpu.N.cx_methods
sysctl, correct the names for dev.cpu.N.{cx_usage,cx_lowest,cx_supported}
sysctls.

Both jkim and avg have some other patches implementing the mwait
functionality; this work is unrelated. Linux does not rely on the
ACPI to provide correct tables describing Cx modes. Instead, the
driver has pre-defined knowledge of the CPU models, it was supplied by
Intel.

Tested by: pho (previous versions)
Sponsored by: The FreeBSD Foundation


# 633a2847 17-Jan-2015 Colin Percival <cperciva@FreeBSD.org>

When disabling C3+ CPU states due to the CPU_QUIRK_NO_C3 quirk, don't
accidentally enable non-existent states.

This bug was triggered if ACPI advertises the presence of a C2 state
which we fail to parse via acpi_PkgGas due to our lack of support for
FFixedHW resources, and causes an immediate panic when an attempt is
made to enter the (NULL) state.

One affected platform is the EC2 c4.8xlarge VM instance type; there
may be others.

MFC after: 1 week
Thanks to: jkim, @_msw_


# 92597e06 05-Jan-2015 John Baldwin <jhb@FreeBSD.org>

On some Intel CPUs with a P-state but not C-state invariant TSC the TSC
may also halt in C2 and not just C3 (it seems that in some cases the BIOS
advertises its C3 state as a C2 state in _CST). Just play it safe and
disable both C2 and C3 states if a user forces the use of the TSC as the
timecounter on such CPUs.

PR: 192316
Differential Revision: https://reviews.freebsd.org/D1441
No objection from: jkim
MFC after: 1 week


# c2641d23 04-Aug-2014 Roger Pau Monné <royger@FreeBSD.org>

xen: add ACPI bus to xen_nexus when running as Dom0

Also disable a couple of ACPI devices that are not usable under Dom0.
To this end a couple of booleans are added that allow disabling ACPI
specific devices.

Sponsored by: Citrix Systems R&D
Reviewed by: jhb

x86/xen/xen_nexus.c:
- Return BUS_PROBE_SPECIFIC in the Xen Nexus attachment routine to
force the usage of the Xen Nexus.
- Attach the ACPI bus when running as Dom0.

dev/acpica/acpi_cpu.c:
dev/acpica/acpi_hpet.c:
dev/acpica/acpi_timer.c
- Add a variable that gates the addition of the devices.

x86/include/init.h:
- Declare variables that control the attachment of ACPI cpu, hpet and
timer devices.


# af3b2549 27-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Pull in r267961 and r267973 again. Fix for issues reported will follow.


# 37a107a4 27-Jun-2014 Glen Barber <gjb@FreeBSD.org>

Revert r267961, r267973:

These changes prevent sysctl(8) from returning proper output,
such as:

1) no output from sysctl(8)
2) erroneously returning ENOMEM with tools like truss(1)
or uname(1)
truss: can not get etype: Cannot allocate memory


# 3da1cf1e 27-Jun-2014 Hans Petter Selasky <hselasky@FreeBSD.org>

Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating children of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, since some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading kludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, since malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after: 2 weeks
Sponsored by: Mellanox Technologies


# 0d470054 07-Apr-2014 Adrian Chadd <adrian@FreeBSD.org>

Add a basic set of data points which count the number of sleep entries
that are being done by the OS.

For now this'll match up with the "wakeups"; although I'll dig deeper into
this to see if we can determine which sleep state the CPU managed to get
into. Most things I've seen these days only expose up to C2 or C3 via
ACPI even though the CPU goes all the way down to C6 or C7.


# d9ddf0c0 28-Feb-2013 Davide Italiano <davide@FreeBSD.org>

MFcalloutng (r247427 by mav):
We don't need any precision here. Let it be a fast and dirty shift rather
than a slow and excessively precise 64-bit division.


# acccf7d8 28-Feb-2013 Davide Italiano <davide@FreeBSD.org>

MFcalloutng:
When CPU becomes idle, cpu_idleclock() calculates time to the next timer
event in order to reprogram hw timer. Return that time in sbintime_t to
the caller and pass it to acpi_cpu_idle(), where it can be used as one
more factor (quite precise) to estimate further sleep time and choose
optimal sleep state. This is a preparatory change for further callout
improvements that will be committed in the next days.

The commit is not targeted for MFC.


# 30bf6110 01-Dec-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu_notify: disable acpi_cpu_idle while updating C-state data

... to avoid any races or inconsistencies.
This should fix a regression introduced in r243404.

Also, remove a stale comment that has not been true for quite a while
now.

Pointyhat to: avg
Tested by: trociny, emaste, dumbbell (earlier version)
MFC after: 1 week


# 09424d43 01-Dec-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: change cpu_disable_idle to be a per-cpu flag...

and make it safe to manipulate and check the flag

With help from: jhb
Tested by: trociny, emaste, dumbbell
MFC after: 1 week


# f51d43fe 22-Nov-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: use fixed resource ids for cx state i/o resources

... instead of the ever increasing ones.
Also, do free old resources when allocating new ones when cx states
change.

Tested by: Tom Lislegaard <Tom.Lislegaard@proact.no>
Obtained from: jkim
MFC after: 1 week


# 154fc7b6 18-Sep-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: explicitly notify userland about c-state changes

... after they are committed.
A notification is sent per CPU.

Reviewed by: imp
MFC after: 3 weeks


# 31482433 11-Sep-2012 Andriy Gapon <avg@FreeBSD.org>

revert r240344: cpu_devices[] is used in other functions and must be kept

Reported by: gjb, glebius
Pointyhat to: avg
MFC after: 1 day
X-MFC note: fake MFC, reminder to never MFC r240344


# ccb92b3c 11-Sep-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: free result of device_get_children

MFC after: 1 week


# 3c5c5559 31-Jul-2012 Alexander Motin <mav@FreeBSD.org>

Add several performance optimizations to acpi_cpu_idle().

For C1 and C2 states use cpu_ticks() to measure sleep time instead of much
slower ACPI timer. We can't do it for C3, as TSC may stop there. But it is
less important there as wake up latency is high any way.

For C1 and C2 states do not check/clear bus mastering activity status, as
it is important only for C3. As a side effect it can make the CPU enter C2
instead of C3 if the last BM activity was two sleeps back (unlike one before),
but that may even be good because of collecting more statistics. A premature
BM wakeup from C3, entered because of overestimation, can easily be worse than
entering C2 from both performance and power consumption points of view.

Together, on a dual Xeon E5645 system in a sequential 512-byte read test, this
change makes cpu_idle_acpi() as fast as the simplest cpu_idle_hlt() and only
a few percent slower than cpu_idle_mwait(), while deeper states are still
actively used during idle periods.

To help with diagnostics, add C-state type into dev.cpu.X.cx_supported.

Sponsored by: iXsystems, Inc.


# d30b88af 13-Jul-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: separate a notion of current deepest allowed+available Cx level

... from a user-set persistent limit on the said level.
Allow to set the user-imposed limit below current deepest available level
as the available levels may be dynamically changed by ACPI platform
in both directions.
Allow "Cmax" as an input value for cx_lowest sysctls to mean that there
is no limit and the OS can use all available C-states.
Retire global cpu_cx_count as it no longer serves any meaningful
purpose.
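
The separation described above can be sketched as two small steps: parse the user's persistent limit (including "Cmax"), then clamp it to whatever is available at the moment. This is a minimal sketch under stated assumptions; CX_NO_LIMIT, parse_cx_limit() and effective_cx_lowest() are hypothetical names, not the driver's own.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <strings.h>

/* Hypothetical sentinel meaning "no user-imposed limit". */
#define	CX_NO_LIMIT	INT_MAX

/*
 * Parse a string such as "C2" or "Cmax" into a zero-based limit.
 * Returns -1 on malformed input.
 */
int
parse_cx_limit(const char *s)
{
	char *end;
	long n;

	if (s == NULL || (s[0] != 'C' && s[0] != 'c'))
		return (-1);
	if (strcasecmp(s + 1, "max") == 0)
		return (CX_NO_LIMIT);
	if (!isdigit((unsigned char)s[1]))
		return (-1);
	n = strtol(s + 1, &end, 10);
	if (*end != '\0' || n < 1 || n > 99)
		return (-1);
	return ((int)n - 1);
}

/*
 * Effective deepest level: the persistent user limit clamped to the
 * number of states currently available. The user limit itself is kept
 * unchanged, so it applies again if deeper states appear later.
 */
int
effective_cx_lowest(int user_limit, int cx_count)
{
	int deepest_available;

	assert(cx_count >= 1);
	deepest_available = cx_count - 1;
	return (user_limit < deepest_available ? user_limit :
	    deepest_available);
}
```

With a user limit of "C3" and only two states available, the effective level is the second state; if a third state becomes available later, the same stored limit allows it without further user action.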

Reviewed by: jhb, gianni, sbruno
Tested by: sbruno, Vitaly Magerya <vmagerya@gmail.com>
MFC after: 2 weeks


# 029468d8 08-Jul-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: we are able to handle _CST change notifications...

so un-ifdef code that is supposed to tell ACPI platform about that

Tested by: Taku YAMAMOTO <taku@tackymt.homeip.net>
MFC after: 2 weeks


# 987e5277 07-Jul-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu_generic_cx_probe: for consistency set cpu_non_c3 here too

although by default only C1 is enabled (cx_lowest=0) and enabling deeper
states goes through acpi_cpu_set_cx_lowest which re-evaluates cpu_non_c3

MFC after: 2 weeks


# 412ef220 07-Jul-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu_cx_list: there is no need to re-evaluate cpu_non_c3 here

cpu_non_c3 is already evaluated in acpi_cpu_cx_cst and in
acpi_cpu_set_cx_lowest.
Besides acpi_cpu_cx_list is not protected by any locking.

As a result also move setting of cpu_can_deep_sleep to more appropriate
places.

MFC after: 2 weeks


# 56101001 07-Jul-2012 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu_cx_cst: consistently use cpu_cx_count during state enumeration

cpu_cx_count is an index into accepted states, while i is an index into
original _CST states

MFC after: 1 week


# 55fb7f36 02-Jul-2012 Sean Bruno <sbruno@FreeBSD.org>

Revert r238004 as more review has come in and there is now a discussion
on how to best proceed.


# 7402aad3 02-Jul-2012 Sean Bruno <sbruno@FreeBSD.org>

Cosmetic display change of Cx states via cx_supported sysctl entries.

Adjust power_profile script to handle the new world order as well.

Some vendors are opting out of a C2 state and only defining C1 & C3. This
leads the acpi_cpu display to indicate that the machine supports C1 & C2
which is caused by the (mis)use of the index of the cx_state array as the
ACPI_STATE_CX value.

e.g. the code was pretending that cx_state[i] would
always convert to i by subtracting 1.

cx_state[2] == ACPI_STATE_C3
cx_state[1] == ACPI_STATE_C2
cx_state[0] == ACPI_STATE_C1

however, on certain machines this would lead to
cx_state[1] == ACPI_STATE_C3
cx_state[0] == ACPI_STATE_C1

This didn't break anything but led to a display of:
* dev.cpu.0.cx_supported: C1/1 C2/96

Instead of
* dev.cpu.0.cx_supported: C1/1 C3/96
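
The display issue above comes down to which number is printed after the "C". A minimal sketch, assuming a simplified state record (struct cx_state and format_cx_supported() are hypothetical), prints the stored ACPI type rather than the array index plus one:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified C-state record. */
struct cx_state {
	int	type;		/* ACPI C-state type: 1 = C1, 2 = C2, ... */
	int	latency;	/* latency value as reported */
};

/*
 * Format "C<type>/<latency>" entries separated by spaces, using the
 * stored type instead of the array index. Returns the number of
 * characters written, or -1 if the buffer is too small.
 */
int
format_cx_supported(const struct cx_state *states, int count, char *buf,
    size_t len)
{
	size_t off = 0;
	int i, n;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < count; i++) {
		n = snprintf(buf + off, len - off, "%sC%d/%d",
		    i > 0 ? " " : "", states[i].type, states[i].latency);
		if (n < 0 || (size_t)n >= len - off)
			return (-1);
		off += (size_t)n;
	}
	return ((int)off);
}
```

For a machine that defines only C1 and C3, the second array slot has type 3, so the second entry is labeled C3 rather than C2.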

MFC after: 2 weeks


# 2aa7a9e6 23-May-2012 Jung-uk Kim <jkim@FreeBSD.org>

Restore Processor object path for verbose boot message.


# e4cd9dcf 23-May-2012 John Baldwin <jhb@FreeBSD.org>

Rework the previous change to honor MADT processor IDs when probing
processor objects. Instead of forcing the new-bus CPU objects to use
a unit number equal to pc_cpuid, adjust acpi_pcpu_get_id() to honor the
MADT IDs by default. As with the previous change, setting
debug.acpi.cpu_unordered to 1 in the loader will revert to the old
behavior.

Tested by: jimharris
MFC after: 1 month


# 4b7ec270 22-Nov-2011 Marius Strobl <marius@FreeBSD.org>

- There's no need to overwrite the default device method with the default
one. Interestingly, these are actually the default for quite some time
(bus_generic_driver_added(9) since r52045 and bus_generic_print_child(9)
since r52045) but even recently added device drivers do this unnecessarily.
Discussed with: jhb, marcel
- While at it, use DEVMETHOD_END.
Discussed with: jhb
- Also while at it, use __FBSDID.


# 6d064c97 24-Jun-2011 Marcel Moolenaar <marcel@FreeBSD.org>

Now that ia64 has been switched to the event timers, remove the
conditional compilation work-arounds.


# 624a5cc8 22-Jun-2011 Jung-uk Kim <jkim@FreeBSD.org>

Fix build on ia64 after r223426.


# a49399a9 22-Jun-2011 Jung-uk Kim <jkim@FreeBSD.org>

Set negative quality to TSC timecounter when C3 state is enabled for Intel
processors unless the invariant TSC bit of CPUID is set. Intel processors
may stop incrementing TSC when DPSLP# pin is asserted, according to Intel
processor manuals, i. e., TSC timecounter is useless if the processor can
enter deep sleep state (C3/C4). This problem was accidentally uncovered by
r222869, which increased timecounter quality of P-state invariant TSC, e.g.,
for Core2 Duo T5870 (Family 6, Model f) and Atom N270 (Family 6, Model 1c).

Reported by: Fabian Keil (freebsd-listen at fabiankeil dot de)
Ian FREISLICH (ianf at clue dot co dot za)
Tested by: Fabian Keil (freebsd-listen at fabiankeil dot de)
- Core2 Duo T5870 (C3 state available/enabled)
jkim - Xeon X5150 (C3 state unavailable)


# 3453537f 07-Apr-2011 Jung-uk Kim <jkim@FreeBSD.org>

Use atomic load & store for TSC frequency. It may be overkill for amd64 but
safer for i386 because it can be easily over 4 GHz now. Worse, it can
be easily changed by user with 'machdep.tsc_freq' tunable (directly) or
cpufreq(4) (indirectly). Note it is intentionally not used in performance
critical paths to avoid performance regression (but we should, in theory).
Alternatively, we may add "virtual TSC" with lower frequency if maximum
frequency overflows 32 bits (and ignore possible incoherency as we do now).
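
To illustrate why a 64-bit frequency wants atomic access on a 32-bit platform, here is a minimal user-space sketch using C11 <stdatomic.h>. The kernel uses its own atomic primitives, so this is a stand-in technique, and tsc_freq_hz, set_tsc_freq(), get_tsc_freq() and tsc_freq_fits_u32() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* A frequency above UINT32_MAX Hz (about 4.29 GHz) needs 64 bits. */
static_assert(sizeof(uint64_t) > sizeof(uint32_t),
    "64-bit storage is required for frequencies beyond 32 bits");

/* Hypothetical stand-in for a shared TSC frequency variable. */
static _Atomic uint64_t tsc_freq_hz;

/* Store the whole 64-bit value in one indivisible operation. */
void
set_tsc_freq(uint64_t hz)
{
	atomic_store(&tsc_freq_hz, hz);
}

/* Load the whole 64-bit value; a reader never sees a torn update. */
uint64_t
get_tsc_freq(void)
{
	return (atomic_load(&tsc_freq_hz));
}

/* Report whether a frequency still fits in 32 bits. */
int
tsc_freq_fits_u32(uint64_t hz)
{
	return (hz <= UINT32_MAX);
}
```

Without atomic access, a 32-bit CPU would update the value as two separate halves, and a concurrent reader could combine the old upper half with the new lower half.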


# e1c9d39e 14-Dec-2010 Jung-uk Kim <jkim@FreeBSD.org>

Stop lying about supporting cpu_est_clockrate() when TSC is invariant. This
function always returned the nominal frequency instead of current frequency
because we use RDTSC instruction to calculate difference in CPU ticks, which
is supposedly constant for the case. Now we support cpu_get_nominal_mhz()
for the case, instead. Note it should be just enough for most usage cases
because cpu_est_clockrate() is often times abused to find maximum frequency
of the processor.


# 68d5e11c 12-Nov-2010 Jung-uk Kim <jkim@FreeBSD.org>

Create C1 state when _CST is valid but _CST does not have one. Some BIOSes
do not report C1 state in _CST object, probably because it is a mandatory
state with or without existence of the optional _CST.

Reviewed by: avg


# a7d5f7eb 19-Oct-2010 Jamie Gritton <jamie@FreeBSD.org>

A new jail(8) with a configuration file, to replace the work currently done
by /etc/rc.d/jail.


# 48fe2e67 22-Sep-2010 Alexander Motin <mav@FreeBSD.org>

Quick fix for unmotivated C2 state usage during boot, introduced at r212541.
That caused LAPIC timer failure and huge delays during boot on some systems.


# 09c22c66 13-Sep-2010 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: do not apply P_LVLx_LAT rules to latencies returned by _CST

ACPI specification states that if P_LVL2_LAT > 100, then a system doesn't
support C2; if P_LVL3_LAT > 1000, then C3 is not supported.
But there are no such rules for Cx state data returned by _CST. If a
state is not supported it should not be included into the return
package. In other words, any latency value returned by _CST is valid,
it's up to the OS and/or user to decide whether to use it.

Submitted by: nork
Suggested by: mav
MFC after: 1 week
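
A minimal sketch of the rule described above, assuming latencies are in
microseconds as in the FADT definition. The enum and function names are
hypothetical and are not taken from acpi_cpu.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tag for where a Cx state description came from. */
enum cx_source { CX_FROM_FADT, CX_FROM_CST };

/*
 * Return true if a Cx state should be accepted based on its latency.
 * FADT P_LVLx_LAT thresholds apply only to FADT-derived states; any
 * latency reported by _CST is treated as valid.
 */
static bool
cx_latency_ok(enum cx_source src, int cx_type, uint32_t latency_us)
{
	if (src == CX_FROM_CST)
		return (true);
	if (cx_type == 2)
		return (latency_us <= 100);	/* P_LVL2_LAT > 100: no C2 */
	if (cx_type == 3)
		return (latency_us <= 1000);	/* P_LVL3_LAT > 1000: no C3 */
	return (true);				/* C1 has no latency rule */
}
```

With this shape, a _CST entry reporting a C3 latency of 2000 is still
accepted, while the same value in P_LVL3_LAT disables C3.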


# a157e425 13-Sep-2010 Alexander Motin <mav@FreeBSD.org>

Refactor timer management code with priority to one-shot operation mode.
The main goal of this is to generate timer interrupts only when there is
some work to do. When the CPU is busy, interrupts are generated at the full
rate of hz + stathz to fulfill scheduler and timekeeping requirements. But
when the CPU is idle, only the minimum set of interrupts (down to 8
interrupts per second per CPU now) needed to handle scheduled callouts is
executed. This allows a significant increase in idle CPU sleep time,
increasing the effect of static power-saving technologies. It should also
reduce host CPU load on virtualized systems when the guest system is idle.

There is a set of tunables, also available as writable sysctls, that
control the event timer subsystem behavior:
kern.eventtimer.timer - selects the event timer hardware to use.
On x86 there are up to 4 different kinds of timers. Depending on whether
the chosen timer is per-CPU, the behavior of other options differs slightly.
kern.eventtimer.periodic - selects periodic or one-shot operation
mode. In periodic mode, the current timer hardware is taken as the only
source of time for time events. This mode is quite similar to the previous
kernel behavior. One-shot mode instead uses the currently selected time
counter hardware to schedule all needed events one by one and programs the
timer to generate an interrupt exactly at the specified time. The default
value depends on the chosen timer's capabilities, but one-shot mode is
preferred unless the other is forced by the user or hardware.
kern.eventtimer.singlemul - in periodic mode, specifies how many times
higher the timer frequency should be, so as not to strictly alias
hardclock() and statclock() events. Default values are 2 and 4, but they
can be reduced to 1 if extra interrupts are unwanted.
kern.eventtimer.idletick - makes each CPU receive every timer interrupt
regardless of whether it is busy or not. By default this option is
disabled. If the chosen timer is per-CPU and runs in periodic mode, this
option has no effect: all interrupts are generated.

Since this patch modifies cpu_idle() on some platforms, I have also
refactored the one on x86. Now it makes use of MONITOR/MWAIT instructions
(if supported) under high sleep/wakeup rate, as a fast alternative to other
methods. It allows the SMP scheduler to wake up sleeping CPUs much faster
without using IPI, significantly increasing performance on some highly
task-switching loads.

Tested by: many (on i386, amd64, sparc64 and powerpc)
H/W donated by: Gheorghe Ardelean
Sponsored by: iXsystems, Inc.


# 3d844edd 10-Sep-2010 Andriy Gapon <avg@FreeBSD.org>

bus_add_child: change type of order parameter to u_int

This reflects actual type used to store and compare child device orders.
Change is mostly done via a Coccinelle (soon to be devel/coccinelle)
semantic patch.
Verified by LINT+modules kernel builds.

Followup to: r212213
MFC after: 10 days


# b6bfb5a0 23-Jun-2010 John Baldwin <jhb@FreeBSD.org>

MFC 209213:
When updating individual CPU's lowest Cx state to use, never set it to a
state lower than the lowest one supported by the current CPU. This closes
some races with changes to the hw.acpi.cpu_cx_lowest sysctl while Cx
states for individual CPUs were changing (e.g. unplugging the AC adapter
of a laptop) that could result in panics.

Approved by: re (kib)


# 3a18e1b6 19-Jun-2010 Alexander Motin <mav@FreeBSD.org>

Oops! Add " / hz" missed in r209328. Assume interrupt rate hz/2, not 1/2.


# 7150e671 19-Jun-2010 Alexander Motin <mav@FreeBSD.org>

While we indeed can't precisely measure time spent in C1, we can consider
the measured interval as an upper bound. It should be more precise than just
assuming hz/2. For an idle CPU it should be quite precise; for a busy one,
no worse than before.


# 42040ff0 15-Jun-2010 John Baldwin <jhb@FreeBSD.org>

When updating individual CPU's lowest Cx state to use, never set it to a
state lower than the lowest one supported by the current CPU. This closes
some races with changes to the hw.acpi.cpu_cx_lowest sysctl while Cx
states for individual CPUs were changing (e.g. unplugging the AC adapter
of a laptop) that could result in panics.

Submitted by: Giovanni Trematerra
Tested by: David Demelier demelier dot david of gmail
MFC after: 3 days
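
The clamping described above can be sketched as follows. This is an
illustration only: the function name and the zero-based index convention
(0 = C1) are assumptions, not the driver's actual code:

```c
/*
 * Clamp a requested lowest-Cx index to what this CPU supports.
 * Indices are zero-based (0 = C1); cpu_cx_count is the number of
 * Cx states currently known for the CPU.
 */
static int
clamp_cx_lowest(int requested_idx, int cpu_cx_count)
{
	if (cpu_cx_count <= 0 || requested_idx < 0)
		return (0);			/* fall back to C1 */
	if (requested_idx > cpu_cx_count - 1)
		return (cpu_cx_count - 1);	/* deepest state available */
	return (requested_idx);
}
```

If a CPU drops from three states to two while the global sysctl still asks
for C3, the per-CPU value becomes index 1 (C2) instead of pointing past the
end of the state table.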


# 3aa6d94e 11-Jun-2010 John Baldwin <jhb@FreeBSD.org>

Update several places that iterate over CPUs to use CPU_FOREACH().


# 57ff35ce 11-Mar-2010 Andriy Gapon <avg@FreeBSD.org>

MFC r203776: acpi cpu: probe+attach before all other enumerated children

X-MFCto7 after: 1 week


# 0445c84e 28-Feb-2010 Andriy Gapon <avg@FreeBSD.org>

MFC r203546: acpi_cpu: prefer _OSC over _PDC


# 9478f399 28-Feb-2010 Andriy Gapon <avg@FreeBSD.org>

MFC r203430: acpi_cpu: correct capabilities arguments for Processor _OSC


# aa835160 11-Feb-2010 Andriy Gapon <avg@FreeBSD.org>

acpi cpu: probe+attach before all other enumerated children on acpi bus

Some current systems dynamically load SSDT(s) when _PDC/_OSC method
of Processor is evaluated. Other devices in ACPI namespace may access
objects defined in the dynamic SSDT. Drivers for such devices might
have to have a rather high priority, because of other dependencies.
Good example is acpi_ec driver for EC.
Thus we attach to Processors as early as possible to load the SSDTs
before any other drivers may try to evaluate control methods.
It also seems to be a natural order for a processor in a device
hierarchy.

On the other hand, some child devices on acpi cpu bus need to access
other system resources like PCI configuration space of chipset devices,
so they need to be probed and attached rather late.
For this reason we probe and attach the cpu bus at
SI_SUB_CONFIGURE:SI_ORDER_MIDDLE SYSINIT level.
In the future this could be done more elegantly via multipass.

Please note that acpi drivers that might access ACPI namespace from
device_identify will do that before _PDC/_OSC of Processors are evaluated.

Legacy cpu driver is not affected by this change.

PR: kern/142561 (in part)
Reviewed by: jhb
Silence from: acpi@
MFC after: 5 weeks


# f4ab0ccc 05-Feb-2010 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: prefer _OSC over _PDC, just in case

_PDC was deprecated in favor of _OSC a long time ago, but it
seems that they still peacefully coexist and in some cases
only _PDC is present.
Still, _OSC provides a richer interface and is capable of
reporting back its status.
If the status is non-zero, then report it, we may find
it useful to understand what firmware expects from OS.
Also clean up some comments that became less useful over time.

Reviewed by: njl, jhb, rpaulo
MFC after: 3 weeks


# e21bbd17 05-Feb-2010 Andriy Gapon <avg@FreeBSD.org>

MFC r197104,197105,197106,197107,197688,198237,199337,199338,200553,200554,
202771,202773: bring acpica version to 20100121

MFC details:
r197104 | jkim | 2009-09-12 01:48:53 +0300 (Sat, 12 Sep 2009) | 4 lines
MFV: r196804
Import ACPICA 20090903

r197105 | jkim | 2009-09-12 01:49:34 +0300 (Sat, 12 Sep 2009) | 2 lines
Catch up with ACPICA 20090903.

r197106 | jkim | 2009-09-12 01:50:15 +0300 (Sat, 12 Sep 2009) | 2 lines
Catch up with ACPICA 20090903.

r197107 | jkim | 2009-09-12 01:56:08 +0300 (Sat, 12 Sep 2009) | 2 lines
Canonify include paths for newly added files.

r197688 | jkim | 2009-10-01 23:56:15 +0300 (Thu, 01 Oct 2009) | 4 lines
Compile ACPI debugger and disassembler for kernel modules
unconditionally.
These files will generate almost empty object files without
ACPI_DEBUG/DDB
options. As a result, size of acpi.ko will increase slightly.

r198237 | jkim | 2009-10-19 19:12:58 +0300 (Mon, 19 Oct 2009) | 2 lines
Merge ACPICA 20091013.

r199337 | jkim | 2009-11-16 23:47:12 +0200 (Mon, 16 Nov 2009) | 2 lines
Merge ACPICA 20091112.

r199338 | jkim | 2009-11-16 23:53:56 +0200 (Mon, 16 Nov 2009) | 2 lines
Add a forgotten module Makefile change from the previous commit.

r200553 | jkim | 2009-12-15 00:24:04 +0200 (Tue, 15 Dec 2009) | 2 lines
Merge ACPICA 20091214.

r200554 | jkim | 2009-12-15 00:28:32 +0200 (Tue, 15 Dec 2009) | 3 lines
Remove _FDE quirk handling as these quirks are automatically repaired
by ACPICA layer since ACPICA 20091214.

r202771 | jkim | 2010-01-21 23:14:28 +0200 (Thu, 21 Jan 2010) | 2 lines
Merge ACPICA 20100121.

r202773 | jkim | 2010-01-21 23:31:39 +0200 (Thu, 21 Jan 2010) | 2 lines
Fix a new header inclusion.

Discussed with: jkim, jhb
No objections from: acpi@


# 877a6d99 03-Feb-2010 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: correct capabilities arguments for Processor _OSC evaluation

Populate capabilities buffer according to
Intel Processor Vendor-Specific ACPI Interface Specification.

MFC after: 2 weeks


# e9aa44c8 30-Nov-2009 Andriy Gapon <avg@FreeBSD.org>

MFC r199016: acpi: remove 'magic' ivar

Note that the ivar itself is kept in the stable branches, only its use is
dropped.


# f6eb382c 07-Nov-2009 Andriy Gapon <avg@FreeBSD.org>

acpi: remove 'magic' ivar

o acpi_hpet: auto-added 'wildcard' devices can be identified by
non-NULL handle attribute.
o acpi_ec: auto-add 'wildcard' devices can be identified by
unset (NULL) private attribute.
o acpi_cpu: use private instead of magic to store cpu id.

Reviewed by: jhb
Silence from: acpi@
MFC after: 2 weeks
X-MFC-Note: perhaps the ivar should stay for ABI stability


# 92488a57 11-Sep-2009 Jung-uk Kim <jkim@FreeBSD.org>

Catch up with ACPICA 20090903.


# 247db074 20-Aug-2009 John Baldwin <jhb@FreeBSD.org>

MFC 196403: Temporarily revert the new-bus locking for 8.0 release.

Approved by: re (kib)


# a56fe095 20-Aug-2009 John Baldwin <jhb@FreeBSD.org>

Temporarily revert the new-bus locking for 8.0 release. It will be
reintroduced after HEAD is reopened for commits by re@.

Approved by: re (kib), attilio


# 444b9186 02-Aug-2009 Attilio Rao <attilio@FreeBSD.org>

Make the newbus subsystem Giant free by adding the new newbus sxlock.
The newbus lock is responsible for protecting newbus internal structures,
device states and devclass flags. It must be held whenever
such data is accessed. For the other operations, softc locking should
ensure enough protection to avoid races.

Newbus lock is automatically held when virtual operations on the device
and bus are invoked when loading the driver or when the suspend/resume
take place. For other 'spurious' operations trying to access/modify
the newbus topology, newbus lock needs to be automatically acquired and
dropped.

For the moment Giant is also acquired at some key points (modules subsystem)
in order to avoid problems before the 8.0 release as module handlers could
make assumptions about it. This Giant locking should go just after
the release happens.

Please keep in mind that the public interface can be expanded in order
to provide more support, if there are really necessities at some point
and also some bugs could arise as long as the patch needs a bit of
further testing.

Bump __FreeBSD_version in order to reflect the newbus lock introduction.

Reviewed by: ed, hps, jhb, imp, mav, scottl
No answer by: ariff, thompsa, yongari
Tested by: pho,
G. Trematerra <giovanni dot trematerra at gmail dot com>,
Brandon Gooch <jamesbrandongooch at gmail dot com>
Sponsored by: Yahoo! Incorporated
Approved by: re (ksmith)


# 129d3046 05-Jun-2009 Jung-uk Kim <jkim@FreeBSD.org>

Import ACPICA 20090521.


# aaac7452 02-Jun-2009 Jung-uk Kim <jkim@FreeBSD.org>

Chase ACPICA API changes (for kernel and boot loader).


# 2500b6d9 03-May-2009 Alexander Motin <mav@FreeBSD.org>

Make dev.cpu.X.cx_usage sysctl also report current average of sleep time.


# bb1d6ad5 02-May-2009 Alexander Motin <mav@FreeBSD.org>

Remove unused variable and fix spelling in comment.


# b0baaaae 02-May-2009 Alexander Motin <mav@FreeBSD.org>

Avoid comparing negative signed to positive unsigned values. It was
leading to a bug where the C-state did not decrease on sleeps shorter than the
declared transition latency. Fixing this deprecates workaround for broken
C-states on some hardware.

By the way, change the state selection logic a bit. Instead of the last
sleep time, use a short-term average of it. The global interrupt rate in the
system is too random a value to correlate subsequent sleeps so directly.
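
The class of bug fixed here can be illustrated with a small sketch. The
function and parameter names are hypothetical; the log does not show the
actual variables involved. The explicit cast in the first function
reproduces the implicit conversion C performs when an int is compared with
an unsigned value:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Buggy form: a negative sleep time converted to unsigned becomes a
 * huge value, so it always appears to cover the transition latency.
 */
static bool
sleep_covers_latency_buggy(int sleep_us, uint32_t latency_us)
{
	return ((uint32_t)sleep_us >= latency_us);
}

/* Fixed form: reject negative values before comparing. */
static bool
sleep_covers_latency_fixed(int sleep_us, uint32_t latency_us)
{
	return (sleep_us >= 0 && (uint32_t)sleep_us >= latency_us);
}
```

With a latency of 57, a sleep time of -1 passes the buggy check but fails
the fixed one, which is what lets the selected C-state back off.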


# ea667177 26-Mar-2009 John Baldwin <jhb@FreeBSD.org>

Move the code to update cpu_cx_count out of acpi_cpu_generic_cx_probe() and
into acpi_cpu_startup() which is where all the other code to update this
global variable lives. This fixes a bug where cpu_cx_count was not updated
correctly if acpi_cpu_generic_cx_probe() returned early.

PR: kern/108581
Debugged by: Bruce Cran
Reviewed by: avg, njl, sepotvin
MFC after: 3 days


# f1e1ddc3 19-Feb-2009 Andriy Gapon <avg@FreeBSD.org>

acpi_cpu: fixup for PIIX4E PCI config related to C2

This is triggered only if BIOS configures ACPI_BITREG_BUS_MASTER_RLD
aka BRLD_EN_BM to 1.
Rationale:
1. we do not support C3 on PIIX4E
2. bus master activity need not break out of C2 state
3. because of CPU_QUIRK_NO_BM_CTRL quirk we may reset bus master
status which would result in immediate break out from C2

So if you have seen
cpu0: too many short sleeps, backing off to C1
with this chipset before you may want to try cx_lowest of C2 again.

Reviewed by: rpaulo (mentor), njl
Approved by: rpaulo (mentor)


# d7f03759 19-Oct-2008 Ulf Lilleengen <lulf@FreeBSD.org>

- Import the HEAD csup code which is the basis for the cvsmode work.


# 89ab2a7a 11-Apr-2008 Rui Paulo <rpaulo@FreeBSD.org>

Update the list of Cx states when ACPICA notifies us. Usually, this
notification is sent when the AC plug is plugged in/out.

This is required on some laptops, namely the MacBooks.

Silence on: freebsd-acpi


# 8a000aca 09-Mar-2008 Rui Paulo <rpaulo@FreeBSD.org>

Some PIIX4 chipsets need to be told to generate Stop Breaks by setting
the appropriate bit in the DEVACTB register.
This change allows the C2 state on those systems to work as expected.

Reviewed by: njl
Submitted by: Andriy Gapon <avg at icyb.net.ua>
MFC after: 1 week


# 6e1de64d 15-Feb-2008 Rui Paulo <rpaulo@FreeBSD.org>

Skip validation of the C3 state if we disabled C3 by software (i.e.,
via quirk).

Submitted by: Andriy Gapon <avg at icyb.net.ua>
Reviewed by: njl (mentor)
Approved by: njl (mentor)
Requested by: njl (mentor)
MFC after: 3 days


# 7a310721 12-Feb-2008 John Baldwin <jhb@FreeBSD.org>

Fix a typo when testing for the NO_C3 quirk.

MFC after: 3 days


# cc3c11f9 02-Nov-2007 Nate Lawson <njl@FreeBSD.org>

Fix a shutdown hang on some SMP systems. The previous logic was to IPI all
CPUs to make sure idle threads are evicted from the softc before returning
from acpi_cpu_shutdown(). However, this is unnecessary since stop_cpus()
handles this for itself and at this point it's possible that our IPI will be
blocked (interrupts disabled).

Thanks to: Glen Leeder <glen.leeder / nokia.com>
MFC after: 3 days


# c961faca 30-Aug-2007 Nate Lawson <njl@FreeBSD.org>

Evaluate _OSC on boot to indicate our OS capabilities to ACPI. This is
needed at least to convince the BIOS to give us access to CPU freq
control on MacBooks.

Submitted by: Rui Paulo <rpaulo / fnop.net>
Approved by: re
MFC after: 5 days


# 3331373c 02-Jun-2007 Nate Lawson <njl@FreeBSD.org>

Disable CPU idle states during suspend and reenable them during resume.
While in the suspend path, this means the idle thread will just return
immediately rather than trying to enter C1-n. This helps in the case where
the chipset is powered down before the rest of the system and reads from
the cpu sleep registers begin returning immediately, causing the logic that
catches bad C2/C3 behavior to kick in. Observed on my Panasonic Y4.

MFC after: 3 days


# b13cf774 02-Jun-2007 Nate Lawson <njl@FreeBSD.org>

Fix a bug introduced in the per-CPU Cx states commit. The wrong loop var
(j/i) was being used and it was being incremented, not decremented as before.
Factor out this code into a common function and call it from both the common
and per-CPU case.

MFC after: 1 day


# 2be4e471 22-Mar-2007 Jung-uk Kim <jkim@FreeBSD.org>

Catch up with ACPI-CA 20070320 import.


# 7826bf98 23-Jan-2007 Nate Lawson <njl@FreeBSD.org>

Add missing function trace for debug prints.


# bd826803 15-Jan-2007 Nate Lawson <njl@FreeBSD.org>

Clean up some debug prints from last commit and move one under boot -v.
Reminded by: bruno


# 30dd6af3 07-Jan-2007 Nate Lawson <njl@FreeBSD.org>

Fix LINT and ACPI_DEBUG builds and add print for use of flush cache inst.


# 907b6777 07-Jan-2007 Nate Lawson <njl@FreeBSD.org>

Re-work Cx handling to be per-cpu and asymmetrical, fixing support on
modern dual-core systems as well.

- Parse the _CST packages for each cpu and track all the states individually,
on a per-cpu basis.

- Revert to generic FADT/P_BLK based Cx control if the _CST package
is not present on all cpus. In that case, the new driver will
still support per-cpu Cx state handling. The driver will determine the
highest Cx level that can be supported by all the cpus and configure the
available Cx state based on that.

- Fixed the case where multiple cpus in the system share the same
registers for Cx state handling. To do that, added a new flag
parameter to the acpi_PkgGas and acpi_bus_alloc_gas functions that
enable the caller to add the RF_SHAREABLE flag. This flag could also be
useful to other callers (acpi_throttle?) in the tree but this change is
not yet made.

- For Core Duo cpus, both cores seem to be taken out of C3 state when
any one of the cores needs to transition out. This broke the short sleep
detection logic. It is now disabled if there is more than one cpu in
the system, as that fixed it in my case. This quirk may need to
be re-enabled later differently.

- Added support to control cx_lowest on a per-cpu basis. There is still
a generic cx_lowest to enable changing cx_lowest for all cpus with a single
sysctl and for ease of use. Sample output for the new sysctl:

dev.cpu.0.cx_supported: C1/1 C2/1 C3/57
dev.cpu.0.cx_lowest: C3
dev.cpu.0.cx_usage: 0.00% 43.16% 56.83%
dev.cpu.1.cx_supported: C1/1 C2/1 C3/57
dev.cpu.1.cx_lowest: C3
dev.cpu.1.cx_usage: 0.00% 45.65% 54.34%
hw.acpi.cpu.cx_lowest: C3

This work was done by Stephane E. Potvin with some simple reworking by
myself. Thank you.

Submitted by: Stephane E. Potvin <sepotvin / videotron.ca>
MFC after: 2 weeks


# 80f006a1 25-Oct-2005 Nate Lawson <njl@FreeBSD.org>

If we're trying to use C2/3 and reads from the register are returning
immediately, back off to the next higher Cx sleep state. Some machines
with a Via chipset report a valid C3 but a register read doesn't actually
halt the CPU. This would cause the machine to appear unresponsive as it
repeatedly called cpu_idle() which immediately returned. Causing interrupts
(i.e. by pressing the power button) would cause the system to make forward
progress, showing that it wasn't actually hung.

Also, enable interrupts a little earlier. We don't need them disabled
to calculate the delta time for the read.

Reported by: silby
MFC after: 2 weeks


# 2a191126 11-Sep-2005 David E. O'Brien <obrien@FreeBSD.org>

Canonize the include of acpi.h.


# f2d94257 10-Apr-2005 Nate Lawson <njl@FreeBSD.org>

Advertise that we can handle unified SMP control of processor power
states, idling, etc. This has been supported since the cpufreq import.


# bce92885 10-Apr-2005 Nate Lawson <njl@FreeBSD.org>

Fix support for _PDC by using the proper version/length format for the
buffer. Also, reference the Intel document where the _PDC values were
found. This now supports ACPI-assisted SpeedStep on my borrowed T42.


# b29224c2 04-Apr-2005 Nate Lawson <njl@FreeBSD.org>

Add the acpi_get_features() method. This method is called on child drivers
to see what features they may support before calling identify/probe/attach.
This is necessary because the ACPI 3.0 spec requires driver support be
advertised before running any methods. For now, the flags are as specified
for the _PDC and _OSC methods but we can support private flags as needed.

Add an implementation of this for acpi_cpu. It checks all its children
(notably cpufreq drivers) and calls the _PDC method to report the results.


# 43ce1c77 26-Mar-2005 Nate Lawson <njl@FreeBSD.org>

If a device_add_child fails (i.e. low memory situation), be sure to free
the unused ivars also.

Submitted by: pjd
Obtained from: Coverity Prevent analysis


# 8b888c66 06-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Remove handling _PSS notifies from acpi_cpu and let acpi_perf handle them.


# 8c5468e3 06-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Remove acpi throttling support from the acpi_cpu(4) driver now that this
is supported by acpi_throttle(4).


# 3cc2f176 06-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Notify the OS that we're taking over Px states in acpi_perf(4) instead of
doing it in the cpu driver. The previous code was incorrect anyway since
this value controls Px states, not throttling as the comment said. Since
we didn't support Px states before, there was no impact. Also, note that
we delay the write to SMI_CMD until after booting is complete since it
sometimes triggers a change in the frequency and we want to have all
drivers ready to detect/handle this.


# 3045c8af 06-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Staticize the legacy cpu devclasses and revert the name for the acpi_cpu
devclass. As pointed out by dfr@, devclasses don't have to share the same
linkage if multiple drivers have the same name. Newbus should match the
devclasses based on name and allocate non-conflicting unit numbers.


# f4eb0418 05-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Convert to the new GAS API so that we can free registers in the future.


# 7d3a0620 04-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Make the devclass static for now until deciding whether to share them.


# 98aa9cd0 03-Feb-2005 Nate Lawson <njl@FreeBSD.org>

Update the CPU attachments to return CPU_IVAR_PCPU as well as pass on
appropriate requests to any children.


# ae56b59f 16-Nov-2004 Nate Lawson <njl@FreeBSD.org>

Enable throttling/C3 quirks for PIIX4 parts. Defer checking quirks until
after boot so that PCI is initialized and we can probe for the problem
chipsets. Note that while probed but unusable states are disabled, they
aren't freed yet. In the future, it may make sense to detach them.

Tested by: Adam K Kirchoff <adamk at voicenet com>
MFC after: 2 days


# f435261d 11-Oct-2004 Nate Lawson <njl@FreeBSD.org>

Update C3 support when BM control is not present.

* Fix a bug where caches were flushed on non-C3 transitions.
* Be sure a working flush cache instruction is present before using it.
* Disable C3 completely if it isn't present.


# 18ececa0 11-Oct-2004 Nate Lawson <njl@FreeBSD.org>

If bus mastering control is not available (PM2_BLK), don't just disable
C3. Instead, flush caches before entering C3. This may be slower but
provides good power savings.


# 31ad3b88 10-Oct-2004 Nate Lawson <njl@FreeBSD.org>

Move the code for halting the CPU (acpi_cpu_c1) into machdep files.
This removes the last MD portion of acpi_cpu.c.

MFC after: 2 weeks


# d92a2ebd 13-Aug-2004 Nate Lawson <njl@FreeBSD.org>

MPSAFE locking

* Hold the ACPI lock over table register writes.
* Serialize calls to acpi_cpu_throttle_set() and the sysctls.


# 4a03551d 23-Jun-2004 Nate Lawson <njl@FreeBSD.org>

Use uintmax_t for CPU statistics and add a cast to prevent truncation of
the statistics in a multiply.

Pointed out by: YONETANI Tomokazu
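
A sketch of why the cast matters when turning a counter into a percentage.
The function name and the choice of hundredths of a percent are assumptions
for illustration, not the driver's code:

```c
#include <stdint.h>

/*
 * Return usage in hundredths of a percent (e.g. 4316 means 43.16%).
 * The cast widens the operand before the multiply; without it,
 * "used * 10000" would be computed in 32 bits and could wrap.
 */
static unsigned
cx_usage_hundredths(uint32_t used, uintmax_t total)
{
	if (total == 0)
		return (0);		/* special-case: no samples yet */
	return ((unsigned)((uintmax_t)used * 10000 / total));
}
```

For example, 4000000000 uses out of 8000000000 yields 5000 (50.00%), whereas
a 32-bit multiply would have overflowed long before the division.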


# 3e7fa136 18-Jun-2004 Nate Lawson <njl@FreeBSD.org>

Add more precision to the cx_usage sysctl output and special-case 0%.

Submitted by: YONETANI Tomokazu <qhwt+freebsd-acpi AT les.ath.cx>


# a2afe45a 05-Jun-2004 Nate Lawson <njl@FreeBSD.org>

Rework acpi_cpu_idle() to select the next idle state before sleeping, not
after. Unify the paths for all Cx states. Remove cpu_idle_busy and
instead do the little profiling we need before re-enabling interrupts.
Use 1 quantum as estimate for C1 sleep duration since the timer interrupt
is the main reason we wake.

While here, change the cx_history sysctl to cx_usage and report statistics
for which idle states were used in terms of percent. This seems more
intuitive than counters. Remove the cx_stats structure since it's no
longer used. Update the man page.

Change various types which do not need explicit size.


# f2b69543 04-Jun-2004 Peter Wemm <peter@FreeBSD.org>

Work around the preemption problem in acpi_cpu.c for shutting down.

Submitted by: nate / jhb


# fe12f24b 30-May-2004 Poul-Henning Kamp <phk@FreeBSD.org>

Add missing <sys/module.h> includes


# ccc09458 06-May-2004 Nate Lawson <njl@FreeBSD.org>

Change hw.acpi.cpu.cx_lowest to accept values in the form of C1,
C2, ... Update power_profile to use the new format. Update the
man page to reflect this and give more info on Cx states.


# b0e2b625 06-May-2004 Nate Lawson <njl@FreeBSD.org>

Rename acpi_cpu to cpu. Change the probe routine to early on reject
devices it cannot attach to. This gets rid of extraneous but harmless
device_probe_and_attach() errors. While I'm here, make the device
description more useful. The !acpi case for cpu is handled by legacy0.


# eea17c34 20-Apr-2004 Nate Lawson <njl@FreeBSD.org>

Move the timer difference convenience function from acpi_cpu.c to make it
globally available. acpi_TimerDelta() subtracts two readings from the
ACPI PM timer and returns the difference. It properly distinguishes between
24-bit and 32-bit timers and handles wraparound.
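
A minimal sketch of such a wraparound-safe delta. It is a stand-in for
acpi_TimerDelta(), not its actual body: the name is hypothetical and the
timer width is passed in explicitly here rather than read from the FADT:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Difference between two PM timer readings, tolerating one wraparound.
 * Unsigned subtraction is modular, so masking the result to the timer
 * width gives the elapsed ticks for both 24-bit and 32-bit timers.
 */
static uint32_t
pm_timer_delta(uint32_t end, uint32_t start, bool timer_is_32bit)
{
	uint32_t mask = timer_is_32bit ? 0xFFFFFFFFu : 0x00FFFFFFu;

	return ((end - start) & mask);
}
```

A 24-bit timer read as 0x00FFFFFE and then as 2 has advanced by 4 ticks,
which is what the masked subtraction returns.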


# 64278df5 09-Apr-2004 Nate Lawson <njl@FreeBSD.org>

Add MODULE_DEPEND entries so some of these drivers can eventually be
loaded separately from ACPI (i.e., embedded use).


# e5ada020 17-Mar-2004 Nate Lawson <njl@FreeBSD.org>

Fix border error to allow systems that specify 100 for latency to also use
C2 and 1000 to use C3.

Submitted by: Bruno Ducrot <ducrot@poupinou.org>
Tested by: Scott Lambert <lambert@lambertfam.org>


# 08b994c0 05-Mar-2004 Nate Lawson <njl@FreeBSD.org>

Document a sysctl.

Submitted by: Craig Rodrigues <rodrigc@crodrigues.org>


# c181b89b 03-Mar-2004 Nate Lawson <njl@FreeBSD.org>

Don't disable Cx support and throttling on machines with a P_BLK_LEN != 6
even though the spec mandates this. Some have a value of 5 to indicate
throttling + C2 and some have 7 to indicate an extra C3 state. Support
throttling if the value is >= 4, C2 for >= 5, and C3 for >= 6.
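
The thresholds above can be sketched as a small helper. The struct and
function names are hypothetical and only illustrate the stated rule:

```c
#include <stdbool.h>
#include <stdint.h>

/* Features implied by the length of the processor register block. */
struct pblk_features {
	bool throttling;
	bool c2;
	bool c3;
};

/* Apply the relaxed P_BLK_LEN rule: >= 4 throttling, >= 5 C2, >= 6 C3. */
static struct pblk_features
pblk_features_from_len(uint32_t p_blk_len)
{
	struct pblk_features f;

	f.throttling = (p_blk_len >= 4);
	f.c2 = (p_blk_len >= 5);
	f.c3 = (p_blk_len >= 6);
	return (f);
}
```

A length of 5 therefore enables throttling and C2 but not C3, and a length
of 7 is treated the same as the spec-mandated 6.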


# 50169793 28-Dec-2003 Nate Lawson <njl@FreeBSD.org>

Don't attach throttling if the P_BLK is 0, even if the P_BLK_LEN is 6.
This is more strict but no known systems have this problem.


# 21cea91f 23-Dec-2003 Nate Lawson <njl@FreeBSD.org>

Remove the device_t parameter from package routines that only used it to
print an error message. Update all callers of the package routines.


# 0d8fb61a 17-Dec-2003 Nate Lawson <njl@FreeBSD.org>

Remove power profile support from acpi_cpu, it will be managed by a
script run from devd(8).


# b279c35a 12-Dec-2003 Nate Lawson <njl@FreeBSD.org>

Fix throttling to use the proper mask. The bug resulted in only two
throttling values being available regardless of the CPU's capabilities.
This has been broken since rev 1.1. Also clarify a comment.

Submitted by: Taku YAMAMATO <taku@cent.saitama-u.ac.jp>


# 4d5f2cbb 10-Dec-2003 John Baldwin <jhb@FreeBSD.org>

Trim trailing whitespace.


# 73a34dd4 08-Dec-2003 Nate Lawson <njl@FreeBSD.org>

We don't need to call _INI on processor objects now that ACPI-CA does
this as it should.


# 447a5fa1 03-Dec-2003 John Baldwin <jhb@FreeBSD.org>

Update this driver to be more module friendly:
- Dynamically allocate the cpu_softc[] array based on mp_maxid instead of
using a statically sized array that depended on 'options SMP'.
- Use mp_maxid rather than MAXCPU when walking all the CPUs looking for a
match.
- Always call smp_rendezvous() since UP kernels now provide this.
- Use mp_ncpus rather than cpu_ndevices when determining if we need to
disable C3 for SMP machines.

Approved by: re (rwatson)
Reviewed by: njl


# cd1f3db9 27-Nov-2003 Nate Lawson <njl@FreeBSD.org>

* If a processor's softc is NULL, use C1 since there is no ACPI
processor object for this CPU. This occurs for logical CPUs which
do not have an associated processor object (e.g., HTT).

Approved by: re (rwatson)


# b6426963 26-Nov-2003 Nate Lawson <njl@FreeBSD.org>

* Add acpi_pcpu_get_id(idx, *acpi_id, *cpu_id) which fetches the
idx'th present CPU with pc_acpi_id equal to *acpi_id. If *acpi_id
does not match that processor's pc_acpi_id, return the value for
ProcId derived from the MADT in *acpi_id. If pc_acpi_id is 0xffffffff,
always override it with the value of *acpi_id. Finally, return
pc_cpuid in *cpu_id and use that as our primary key.

* Use pc_cpuid as our unique key because we know it is valid since
MD code set it. The values for ProcId in the ASL and MADT don't
match up on some machines (!), forcing us to fall back to ordered
probing in that case.

* Remove some #ifdef SMP since the refcount doesn't hurt performance
and will be needed for dynamic _CST objects. Only one #ifdef SMP
(for smp_rendezvous) remains.

* Hook up SMP in the compile flags in the Makefile.

Tested by: marcel, truckman
Approved by: re (scottl)


# 56a70ead 19-Nov-2003 Nate Lawson <njl@FreeBSD.org>

* Add a DEVMETHOD for acpi so that child detach methods get called. Add
an acpi_cpu method for shutdown that disables entry to acpi_cpu_idle
and then IPIs/waits for threads to exit. This fixes a panic late in
reboot in the SMP case.

* In the !SMP case, don't use the processor id filled out by the MADT
since there can only be one processor. This was causing a panic in
acpi_cpu_idle if the id was 1 since the data was being dereferenced from
cpu_softc[1] even though the actual data was in cpu_softc[0] (which is
correct).

* Rework the initialization functions so that cpu_idle_hook is written
late in the boot process.

* Make the P_BLK, P_BLK_LEN, and cpu_cx_count all softc-local variables.
This will help SMP boxes that have _CST or multiple P_BLKs. No such
boxes are known at this time.

* Always allocate the C1 state, even if the P_BLK is invalid. This means
we will always take over idling if enabled. Remove the value -1 as
valid for cx_lowest since this is redundant with machdep.cpu_idle_hlt.

* Reduce locking for the throttle initialization case to around the write
to the smi_cmd port. Add disabled code to write the CST_CNT. It will
be enabled once _CST re-evaluation is tested (post 5.2R).

Thank you: dfr, imp, jhb, marcel, peter
Tested by: rwatson, Harald Schmalzbauer <h@schmalzbauer.de>
Approved by: re (rwatson)


# 6b74f9b7 15-Nov-2003 Nate Lawson <njl@FreeBSD.org>

Implement Cx CPU idle states and updated throttling support.

* Use the cpu_idle_hook() to do idling for C1-C3.
* Use both _CST and the FADT to detect Cx states.
* Use both _PTC and P_CNT for controlling throttling.
* Add a notify handler to detect changes in _CST and _PSS
* Call the _INI function for each processor if present. This will be
done by ACPI-CA in the future.
* Fix a bug on SMP systems where CPUs will attach multiple times if the
bus is rescanned.
* Document new sysctls for controlling idling.


# be2b1797 28-Aug-2003 Nate Lawson <njl@FreeBSD.org>

Style and whitespace changes. Also, make the ivar functions non-inline
since inlining failed due to the size of BUS_*.


# aad970f1 24-Aug-2003 David E. O'Brien <obrien@FreeBSD.org>

Use __FBSDID().
Also some minor style cleanups.


# a40f20c7 23-Jan-2003 Nate Lawson <njl@FreeBSD.org>

More useful announce message containing current speed of CPU


# d6b992c7 14-Jan-2003 Nate Lawson <njl@FreeBSD.org>

For the cpu throttling message, s/enabled/available

Requested by: many


# fc0ea94a 16-Oct-2002 John Baldwin <jhb@FreeBSD.org>

Catch up to changes in acpivar.h to add support for using ACPI on
4-stable systems.

Sponsored by: The Weather Channel


# b4a05238 19-May-2002 Peter Wemm <peter@FreeBSD.org>

Brutally deal with __func__ being 'const char *' on gcc-3.1.


# 899ccf54 04-Mar-2002 Mitsuru IWASAKI <iwasaki@FreeBSD.org>

Add generalized power profile code.
This allows other power-management systems (APM for now) to generate
power profile change events (i.e., AC-line status changes), so that
other kernel components, not only the ACPI components, can be notified
of the events.

- move subroutines in acpi_powerprofile.c (removed) to kern/subr_power.c
- call power_profile_set_state() also from APM driver when AC-line
status changes
- add a callback function for Crusoe LongRun control on power
profile changes as an example


# 9127281c 22-Feb-2002 Mike Smith <msmith@FreeBSD.org>

Match namespace cleanup changes in ACPI CA 20020217 update.
Use ACPI_SUCCESS/ACPI_FAILURE consistently.


# 3273b005 07-Jan-2002 Mike Smith <msmith@FreeBSD.org>

Staticise devclasses and some unnecessarily global variables.


# 3e759f36 02-Jan-2002 Mike Smith <msmith@FreeBSD.org>

If the CLK_VAL register is 0 bits wide, the system does not support
CPU throttling, so don't do some bogus math to check it.
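
The check described above can be sketched as follows. This is an illustration under stated assumptions, not the driver's code: the function name sketch_throttle_states is hypothetical, and the state count of 2^width follows from the ACPI definition of the duty cycle field width rather than from this commit.

```c
/*
 * Number of throttle states implied by the width, in bits, of the
 * duty cycle field.  A width of 0 means the system does not support
 * CPU throttling, so return 0 before doing any arithmetic on it.
 * Widths too large to shift safely are also treated as unsupported
 * (an assumption made for this sketch).
 */
static unsigned int
sketch_throttle_states(unsigned int duty_width)
{
	if (duty_width == 0 || duty_width >= 32)
		return (0);
	return (1u << duty_width);
}
```

A caller would skip creating the throttling controls whenever the function returns 0.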


# 6971b3c7 18-Nov-2001 Mitsuru IWASAKI <iwasaki@FreeBSD.org>

Clean up verbose printing. All debugging messages are disabled unless
the verbose flag is set. Also fix the English in some messages.
The critical messages and error messages in the probe/attach routines
are unchanged by this commit.


# e5e5b51f 23-Oct-2001 John Baldwin <jhb@FreeBSD.org>

Allow hw.acpi.cpu.{economy,performance}_speed to be set from the loader
via tunables.


# f48bf2d7 29-Aug-2001 Mike Smith <msmith@FreeBSD.org>

Add missing acpi_disabled() call so that this driver can be disabled.


# 4c1cdee6 26-Aug-2001 Mike Smith <msmith@FreeBSD.org>

Updates to match the ACPI CA 20010816 import:

- New debug macro (ACPI_DEBUG_PRINT), reducing debug-case code size.
- New debug level/subsystem codes.


# bfae45aa 21-Jul-2001 Mike Smith <msmith@FreeBSD.org>

Convert from acpi_strerror() to AcpiFormatException()

Fix dangling include of the dear departed acpi_ecreg.h


# ad5dc75b 20-Jul-2001 Mike Smith <msmith@FreeBSD.org>

Use our saved copy of the FADT rather than fetching it again.


# f0987736 07-Jul-2001 Mitsuru IWASAKI <iwasaki@FreeBSD.org>

Fix typo in acpi_cpu_attach() and correct range checking in
acpi_cpu_speed_sysctl().


# fec754d4 07-Jul-2001 Mike Smith <msmith@FreeBSD.org>

Kill the old processor driver; the ACPI CA functions it depended on
are not coming back any time soon. Implement a new 'acpi_cpu' driver
with support for CPU throttling and power policies.