History log of /linux-master/arch/x86/kernel/cpu/intel_epb.c
Revision Date Author Comments
# d87c4937 21-Nov-2023 James Morse <james.morse@arm.com>

x86: intel_epb: Don't rely on link order

intel_epb_init() is called as a subsys_initcall() to register cpuhp
callbacks. The callbacks make use of get_cpu_device() which will return
NULL unless register_cpu() has been called. register_cpu() is called
from topology_init(), which is also a subsys_initcall().

This is fragile. Moving the register_cpu() to a different
subsys_initcall() leads to a NULL dereference during boot.

Make intel_epb_init() a late_initcall(), user-space can't provide a
policy before this point anyway.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Signed-off-by: "Russell King (Oracle)" <rmk+kernel@armlinux.org.uk>
Acked-by: "Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/E1r5R2m-00Csyb-2S@rmk-PC.armlinux.org.uk
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 5bfa0e45 24-Nov-2023 James Morse <james.morse@arm.com>

x86/cpu/intel_epb: Don't rely on link order

intel_epb_init() is called as a subsys_initcall() to register cpuhp
callbacks. The callbacks make use of get_cpu_device() which will return
NULL unless register_cpu() has been called. register_cpu() is called
from topology_init(), which is also a subsys_initcall().

This is fragile. Moving the register_cpu() to a different
subsys_initcall() leads to a NULL dereference during boot.

Make intel_epb_init() a late_initcall(), user-space can't provide a
policy before this point anyway.

Signed-off-by: James Morse <james.morse@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# 882cdb06 07-Aug-2023 Peter Zijlstra <peterz@infradead.org>

x86/cpu: Fix Gracemont uarch

Alderlake N is an E-core only product using Gracemont
micro-architecture. It fits the pre-existing naming scheme perfectly
fine, adhere to it.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230807150405.686834933@infradead.org


# 7420ae3b 27-Oct-2022 Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

x86/intel_epb: Set Alder Lake N and Raptor Lake P normal EPB

Intel processors support additional software hint called EPB ("Energy
Performance Bias") to guide the hardware heuristic of power management
features to favor increasing dynamic performance or conserve energy
consumption.

Since this EPB hint is processor specific, the same value of hint can
result in different behavior across generations of processors.

commit 4ecc933b7d1f ("x86: intel_epb: Allow model specific normal EPB
value")' introduced capability to update the default power up EPB
based on the CPU model and updated the default EPB to 7 for Alder Lake
mobile CPUs.

The same change is required for other Alder Lake-N and Raptor Lake-P
mobile CPUs as the current default of 6 results in higher uncore power
consumption. This increase in power is related to memory clock
frequency setting based on the EPB value.

Depending on the EPB the minimum memory frequency is set by the
firmware. At EPB = 7, the minimum memory frequency is 1/4th compared to
EPB = 6. This results in significant power saving for idle and
semi-idle workload on a Chrome platform.

For example Change in power and performance from EPB change from 6 to 7
on Alder Lake-N:

Workload Performance diff (%) power diff
----------------------------------------------------
VP9 FHD30 0 (FPS) -218 mw
Google meet 0 (FPS) -385 mw

This 200+ mw power saving is very significant for mobile platform for
battery life and thermal reasons.

But as the workload demands more memory bandwidth, the memory frequency
will be increased very fast. There is no power savings for such busy
workloads.

For example:

Workload Performance diff (%) from EPB 6 to 7
-------------------------------------------------------
Speedometer 2.0 -0.8
WebGL Aquarium 10K
Fish -0.5
Unity 3D 2018 0.2
WebXPRT3 -0.5

There are run to run variations for performance scores for
such busy workloads. So the difference is not significant.

Add a new define ENERGY_PERF_BIAS_NORMAL_POWERSAVE for EPB 7
and use it for Alder Lake-N and Raptor Lake-P mobile CPUs.

This modification is done originally by
Jeremy Compostella <jeremy.compostella@intel.com>.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Link: https://lore.kernel.org/all/20221027220056.1534264-1-srinivas.pandruvada%40linux.intel.com


# 4ecc933b 11-Dec-2021 Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>

x86: intel_epb: Allow model specific normal EPB value

The current EPB "normal" is defined as 6 and set whenever power-up EPB
value is 0. This setting resulted in the desired out of box power and
performance for several CPU generations. But this value is not suitable
for AlderLake mobile CPUs, as this resulted in higher uncore power.
Since EPB is model specific, this is not unreasonable to have different
behavior.

Allow a capability where "normal" EPB can be redefined. For AlderLake
mobile CPUs this desired normal value is 7.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>


# be1fcde6 26-May-2019 Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86: intel_epb: Do not build when CONFIG_PM is unset

Commit 9ed0985332a6 ("x86: intel_epb: Take CONFIG_PM into account")
prevented the majority of the Performance and Energy Bias Hint (EPB)
handling code from being built when CONFIG_PM is unset to fix a
regression introduced by commit b9c273babce7 ("PM / arch: x86:
MSR_IA32_ENERGY_PERF_BIAS sysfs interface").

In hindsight, however, it would be better to skip all of the EPB
handling code for CONFIG_PM unset as there really is no reason for
it to be there in that case. Namely, if the EPB is not touched
by the kernel at all with CONFIG_PM unset, there is no need to
worry about modifying the EPB inadvertently on CPU online and since
the system will not suspend or hibernate then, there is no need to
worry about possible modifications of the EPB by the platform
firmware during system-wide PM transitions.

For this reason, revert the changes made by commit 9ed0985332a6
and only allow intel_epb.o to be built when CONFIG_PM is set.

Note that this changes the behavior of the kernels built with
CONFIG_PM unset as they will not modify the EPB on boot if it is
zero initially any more, so it is not a fix strictly speaking, but
users building their kernels with CONFIG_PM unset really should not
expect them to take energy efficiency into account. Moreover, if
CONFIG_PM is unset for performance reasons, leaving EPB as set
initially by the platform firmware will actually be consistent
with the user's expectations.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>


# 9ed09853 09-May-2019 Rafael J. Wysocki <rafael.j.wysocki@intel.com>

x86: intel_epb: Take CONFIG_PM into account

Commit b9c273babce7 ("PM / arch: x86: MSR_IA32_ENERGY_PERF_BIAS sysfs
interface") caused kernels built with CONFIG_PM unset to crash on
systems supporting the Performance and Energy Bias Hint (EPB),
because it attempts to add files to sysfs directories that don't
exist on those systems.

Prevent that from happening by taking CONFIG_PM into account so
that the code depending on it is not compiled at all when it is
not set.

Fixes: b9c273babce7 ("PM / arch: x86: MSR_IA32_ENERGY_PERF_BIAS sysfs interface")
Reported-by: Ido Schimmel <idosch@mellanox.com>
Tested-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ingo Molnar <mingo@kernel.org>


# b9c273ba 21-Mar-2019 Rafael J. Wysocki <rafael.j.wysocki@intel.com>

PM / arch: x86: MSR_IA32_ENERGY_PERF_BIAS sysfs interface

The Performance and Energy Bias Hint (EPB) is expected to be set by
user space through the generic MSR interface, but that interface is
not particularly nice and there are security concerns regarding it,
so it is not always available.

For this reason, add a sysfs interface for reading and updating the
EPB, in the form of a new attribute, energy_perf_bias, located
under /sys/devices/system/cpu/cpu#/power/ for online CPUs that
support the EPB feature.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Borislav Petkov <bp@suse.de>


# 5861381d 21-Mar-2019 Rafael J. Wysocki <rafael.j.wysocki@intel.com>

PM / arch: x86: Rework the MSR_IA32_ENERGY_PERF_BIAS handling

The current handling of MSR_IA32_ENERGY_PERF_BIAS in the kernel is
problematic, because it may cause changes made by user space to that
MSR (with the help of the x86_energy_perf_policy tool, for example)
to be lost every time a CPU goes offline and then back online as well
as during system-wide power management transitions into sleep states
and back into the working state.

The first problem is that if the current EPB value for a CPU going
online is 0 ('performance'), the kernel will change it to 6 ('normal')
regardless of whether or not this is the first bring-up of that CPU.
That also happens during system-wide resume from sleep states
(including, but not limited to, hibernation). However, the EPB may
have been adjusted by user space this way and the kernel should not
blindly override that setting.

The second problem is that if the platform firmware resets the EPB
values for any CPUs during system-wide resume from a sleep state,
the kernel will not restore their previous EPB values that may
have been set by user space before the preceding system-wide
suspend transition. Again, that behavior may at least be confusing
from the user space perspective.

In order to address these issues, rework the handling of
MSR_IA32_ENERGY_PERF_BIAS so that the EPB value is saved on CPU
offline and restored on CPU online as well as (for the boot CPU)
during the syscore stages of system-wide suspend and resume
transitions, respectively.

However, retain the policy by which the EPB is set to 6 ('normal')
on the first bring-up of each CPU if its initial value is 0, based
on the observation that 0 may mean 'not initialized' just as well as
'performance' in that case.

While at it, move the MSR_IA32_ENERGY_PERF_BIAS handling code into
a separate file and document it in Documentation/admin-guide.

Fixes: abe48b108247 (x86, intel, power: Initialize MSR_IA32_ENERGY_PERF_BIAS)
Fixes: b51ef52df71c (x86/cpu: Restore MSR_IA32_ENERGY_PERF_BIAS after resume)
Reported-by: Thomas Renninger <trenn@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Acked-by: Borislav Petkov <bp@suse.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>