Cross Reference: /linux-master/drivers/cpufreq/cpufreq.c

History log of /linux-master/drivers/cpufreq/cpufreq.c
Revision	Date	Author	Comments
# f37a4d6b	12-Mar-2024	Sibi Sankar <quic_sibis@quicinc.com>	cpufreq: Fix per-policy boost behavior on SoCs using cpufreq_boost_set_sw() In the existing code, per-policy flags don't have any impact i.e. if cpufreq_driver boost is enabled and boost is disabled for one or more of the policies, the cpufreq driver will behave as if boost is enabled. Fix this by incorporating per-policy boost flag in the policy->max computation used in cpufreq_frequency_table_cpuinfo and setting the default per-policy boost to mirror the cpufreq_driver boost flag. Fixes: 218a06a79d9a ("cpufreq: Support per-policy performance boost") Reported-by: Dietmar Eggemann <dietmar.eggemann@arm.com> Reviewed-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Dhruva Gole <d-gole@ti.com> Signed-off-by: Sibi Sankar <quic_sibis@quicinc.com> Tested-by:Yipeng Zou <zouyipeng@huawei.com> <mailto:zouyipeng@huawei.com> Reviewed-by: Yipeng Zou <zouyipeng@huawei.com> <mailto:zouyipeng@huawei.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# c4d61a52	29-Feb-2024	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Don't unregister cpufreq cooling on CPU hotplug Offlining a CPU and bringing it back online is a common operation and it happens frequently during system suspend/resume, where the non-boot CPUs are hotplugged out during suspend and brought back at resume. The cpufreq core already tries to make this path as fast as possible as the changes are only temporary in nature and full cleanup of resources isn't required in this case. For example the drivers can implement online()/offline() callbacks to avoid a lot of tear down of resources. On similar lines, there is no need to unregister the cpufreq cooling device during suspend / resume, but only while the policy is getting removed. Moreover, unregistering the cpufreq cooling device is resulting in an unwanted outcome, where the system suspend is eventually aborted in the process. Currently, during system suspend the cpufreq core unregisters the cooling device, which in turn removes a kobject using device_del() and that generates a notification to the userspace via uevent broadcast. This causes system suspend to abort in some setups. This was also earlier reported (indirectly) by Roman [1]. Maybe there is another way around to fixing that problem properly, but this change makes sense anyways. Move the registering and unregistering of the cooling device to policy creation and removal times onlyy. Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218521 Reported-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com> Reported-by: Roman Stratiienko <r.stratiienko@gmail.com> Link: https://patchwork.kernel.org/project/linux-pm/patch/20220710164026.541466-1-r.stratiienko@gmail.com/ [1] Tested-by: Manaf Meethalavalappu Pallikunhi <quic_manafm@quicinc.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: Dhruva Gole <d-gole@ti.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# a755d0e2	27-Feb-2024	Qais Yousef <qyousef@layalina.io>	cpufreq: Honour transition_latency over transition_delay_us Some platforms like Arm's Juno can have a high transition latency that can be larger than the 2ms cap introduced. If a driver reports a transition_latency that is higher than the cap, then use it as-is. Update comment s/10/2/ to reflect the new cap of 2ms. Reported-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Qais Yousef <qyousef@layalina.io> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# e13aa799	04-Feb-2024	Qais Yousef <qyousef@layalina.io>	cpufreq: Change default transition delay to 2ms 10ms is too high for today's hardware, even low end ones. This default end up being used a lot on Arm machines at least. Pine64, mac mini and pixel 6 all end up with 10ms rate_limit_us when using schedutil, and it's too high for all of them. Change the default to 2ms which should be 'pessimistic' enough for worst case scenario, but not too high for platforms with fast DVFS hardware. Signed-off-by: Qais Yousef <qyousef@layalina.io> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 599457ba	11-Dec-2023	Vincent Guittot <vincent.guittot@linaro.org>	cpufreq: Use the fixed and coherent frequency for scaling capacity cpuinfo.max_freq can change at runtime because of boost as an example. This implies that the value could be different from the frequency that has been used to compute the capacity of a CPU. The new arch_scale_freq_ref() returns a fixed and coherent frequency that can be used to compute the capacity for a given frequency. [ Also fix a arch_set_freq_scale() newline style wart in <linux/cpufreq.h>. ] Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Tested-by: Lukasz Luba <lukasz.luba@arm.com> Reviewed-by: Lukasz Luba <lukasz.luba@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20231211104855.558096-3-vincent.guittot@linaro.org
# e7a1b32e	05-Oct-2023	Pierre Gondois <pierre.gondois@arm.com>	cpufreq: Rebuild sched-domains when removing cpufreq driver The Energy Aware Scheduler (EAS) relies on the schedutil governor. When moving to/from the schedutil governor, sched domains must be rebuilt to allow re-evaluating the enablement conditions of EAS. This is done through sched_cpufreq_governor_change(). Having a cpufreq governor assumes a cpufreq driver is running. Inserting/removing a cpufreq driver should trigger a re-evaluation of EAS enablement conditions, avoiding to see EAS enabled when removing a running cpufreq driver. Rebuild the sched domains in schedutil's sugov_init()/sugov_exit(), allowing to check EAS's enablement condition whenever schedutil governor is initialized/exited from. Move relevant code up in schedutil.c to avoid a split and conditional function declaration. Rename sched_cpufreq_governor_change() to sugov_eas_rebuild_sd(). Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 0faf84ca	12-Sep-2023	Justin Stitt <justinstitt@google.com>	cpufreq: Replace deprecated strncpy() with strscpy() `strncpy` is deprecated for use on NUL-terminated destination strings [1]. We should prefer more robust and less ambiguous string interfaces. Both `policy->last_governor` and `default_governor` are expected to be NUL-terminated which is shown by their heavy usage with other string apis like `strcmp`. A suitable replacement is `strscpy` [2] due to the fact that it guarantees NUL-termination on the destination buffer. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Link: https://lore.kernel.org/r/20230913-strncpy-drivers-cpufreq-cpufreq-c-v1-1-f1608bfeff63@google.com Signed-off-by: Kees Cook <keescook@chromium.org>
# 218a06a7	22-Aug-2023	Jie Zhan <zhanjie9@hisilicon.com>	cpufreq: Support per-policy performance boost The boost control currently applies to the whole system. However, users may prefer to boost a subset of cores in order to provide prioritized performance to workloads running on the boosted cores. Enable per-policy boost by adding a 'boost' sysfs interface under each policy path. This can be found at: /sys/devices/system/cpu/cpufreq/policy<*>/boost Same to the global boost switch, writing 1/0 to the per-policy 'boost' enables/disables boost on a cpufreq policy respectively. The user view of global and per-policy boost controls should be: 1. Enabling global boost initially enables boost on all policies, and per-policy boost can then be enabled or disabled individually, given that the platform does support so. 2. Disabling global boost makes the per-policy boost interface illegal. Signed-off-by: Jie Zhan <zhanjie9@hisilicon.com> Reviewed-by: Wei Xu <xuwei5@hisilicon.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 61bfbf79	29-Aug-2023	Liao Chang <liaochang1@huawei.com>	cpufreq: Fix the race condition while updating the transition_task of policy The field 'transition_task' of policy structure is used to track the task which is performing the frequency transition. Using this field to print a warning once detect a case where the same task is calling _begin() again before completing the preivous frequency transition via the _end(). However, there is a potential race condition in _end() and _begin() APIs while updating the field 'transition_task' of policy, the scenario is depicted below: Task A Task B /* 1st freq transition / Invoke _begin() { ... ... } / 2nd freq transition */ Invoke _begin() { ... //waiting for A to ... //clear ... //transition_ongoing ... //in _end() for ... //the 1st transition \| Change the frequency \| \| Invoke _end() { \| ... \| ... \| transition_ongoing = false; V transition_ongoing = true; transition_task = current; transition_task = NULL; ... //A overwrites the task ... //performing the transition ... //result in error warning. } To fix this race condition, the transition_lock of policy structure is now acquired before updating policy structure in _end() API. Which ensure that only one task can update the 'transition_task' field at a time. Link: https://lore.kernel.org/all/b3c61d8a-d52d-3136-fbf0-d1de9f1ba411@huawei.com/ Fixes: ca654dc3a93d ("cpufreq: Catch double invocations of cpufreq_freq_transition_begin/end") Signed-off-by: Liao Chang <liaochang1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 1f464cb4	29-Aug-2023	Liao Chang <liaochang1@huawei.com>	cpufreq: Avoid printing kernel addresses in cpufreq_resume() The pointer value of policy and driver structure are currently printed in the error messages of cpufreq_resume(), this is not recommended and helpful. In order to be consistent with the error message in cpufreq_suspend() and easier to understand, print the name of driver strcture and the manage CPU of policy structure individually in the error messages of cpufreq_resume(). Link: https://lore.kernel.org/all/b7be717c-41d8-bbbf-3e97-3799948ab757@huawei.com Signed-off-by: Liao Chang <liaochang1@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# ba6ea77d	14-Aug-2023	Liao Chang <liaochang1@huawei.com>	cpufreq: Prefer to print cpuid in MIN/MAX QoS register error message When a cpufreq_policy is allocated, the cpus, related_cpus and real_cpus of policy are still unset. Therefore, it is preferable to print the passed 'cpu' parameter instead of a empty 'cpus' cpumask in error message when registering MIN/MAX QoS notifier fails. Signed-off-by: Liao Chang <liaochang1@huawei.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
# b4a11fa3	29-May-2023	Wyes Karny <wyes.karny@amd.com>	cpufreq: Fail driver register if it has adjust_perf without fast_switch If fast_switch_possible flag is set by the scaling driver, the governor is free to select fast_switch function even if adjust_perf is set. Some scaling drivers which use adjust_perf don't set fast_switch thinking that the governor would never fall back to fast_switch. But the governor can fall back to fast_switch even in runtime if frequency invariance is disabled due to some reason. This could crash the kernel if the driver didn't set the fast_switch function pointer. Therefore, fail driver registration if it has adjust_perf without fast_switch. Suggested-by: Rafael J. Wysocki <rafael@kernel.org> Suggested-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Wyes Karny <wyes.karny@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 2744a63c	13-Mar-2023	Greg Kroah-Hartman <gregkh@linuxfoundation.org>	cpufreq: move to use bus_get_dev_root() Direct access to the struct bus_type dev_root pointer is going away soon so replace that with a call to bus_get_dev_root() instead, which is what it is there for. Cc: Viresh Kumar <viresh.kumar@linaro.org> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Len Brown <lenb@kernel.org> Cc: linux-pm@vger.kernel.org Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20230313182918.1312597-3-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
# 44295af5	18-Apr-2023	Sanjay Chandrashekara <sanjayc@nvidia.com>	cpufreq: use correct unit when verify cur freq cpufreq_verify_current_freq checks() if the frequency returned by the hardware has a slight delta with the valid frequency value last set and returns "policy->cur" if the delta is within "1 MHz". In the comparison, "policy->cur" is in "kHz" but it's compared against HZ_PER_MHZ. So, the comparison range becomes "1 GHz". Fix this by comparing against KHZ_PER_MHZ instead of HZ_PER_MHZ. Fixes: f55ae08c8987 ("cpufreq: Avoid unnecessary frequency updates due to mismatch") Signed-off-by: Sanjay Chandrashekara <sanjayc@nvidia.com> [ sumit gupta: Commit message update ] Signed-off-by: Sumit Gupta <sumitg@nvidia.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# a038895e	03-Apr-2023	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: drivers with target_index() must set freq_table Since the cpufreq core directly uses freq_table, for cpufreq drivers that set their target_index() callback, make it mandatory for them to set the same. Since this is set per policy and normally from policy->init(), do this from cpufreq_table_validate_and_sort() which gets called right after ->init(). Reported-by: Yajun Deng <yajun.deng@linux.dev> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 877d5cd2	20-Mar-2023	qinyu <qinyu32@huawei.com>	cpufreq: warn about invalid vals to scaling_max/min_freq interfaces When echo an invalid val to scaling_min_freq: > echo 123abc123 > scaling_min_freq It looks weird to have a return val of 0: > echo $? > 0 Sane people won't echo strings like that into these interfaces but fuzz tests may do. Also, maybe it's better to inform people if input is invalid. After this: > echo 123abc123 > scaling_min_freq > -bash: echo: write error: Invalid argument Signed-off-by: qinyu <qinyu32@huawei.com> Tested-by: zhangxiaofeng <zhangxiaofeng46@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 108fcad9	07-Feb-2023	Thomas Weißschuh <linux@weissschuh.net>	cpufreq: Make kobj_type structure constant Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definition to prevent modification at runtime. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# dd329e1e	07-Feb-2023	Uwe Kleine-König <u.kleine-koenig@pengutronix.de>	cpufreq: Make cpufreq_unregister_driver() return void All but a few drivers ignore the return value of cpufreq_unregister_driver(). Those few that don't only call it after cpufreq_register_driver() succeeded, in which case the call doesn't fail. Make the function return no value and add a WARN_ON for the case that the function is called in an invalid situation (i.e. without a previous successful call to cpufreq_register_driver()). Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Acked-by: Florian Fainelli <f.fainelli@gmail.com> # brcmstb-avs-cpufreq.c Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 5c510548	10-Nov-2022	Yongqiang Liu <liuyongqiang13@huawei.com>	cpufreq: Init completion before kobject_init_and_add() In cpufreq_policy_alloc(), it will call uninitialed completion in cpufreq_sysfs_release() when kobject_init_and_add() fails. And that will cause a crash such as the following page fault in complete: BUG: unable to handle page fault for address: fffffffffffffff8 [..] RIP: 0010:complete+0x98/0x1f0 [..] Call Trace: kobject_put+0x1be/0x4c0 cpufreq_online.cold+0xee/0x1fd cpufreq_add_dev+0x183/0x1e0 subsys_interface_register+0x3f5/0x4e0 cpufreq_register_driver+0x3b7/0x670 acpi_cpufreq_init+0x56c/0x1000 [acpi_cpufreq] do_one_initcall+0x13d/0x780 do_init_module+0x1c3/0x630 load_module+0x6e67/0x73b0 __do_sys_finit_module+0x181/0x240 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd Fixes: 4ebe36c94aed ("cpufreq: Fix kobject memleak") Signed-off-by: Yongqiang Liu <liuyongqiang13@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: 5.2+ <stable@vger.kernel.org> # 5.2+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 6ca7076f	16-Aug-2022	Lukasz Luba <lukasz.luba@arm.com>	cpufreq: check only freq_table in __resolve_freq() There is no need to check if the cpufreq driver implements callback cpufreq_driver::target_index. The logic in the __resolve_freq uses the frequency table available in the policy. It doesn't matter if the driver provides 'target_index' or 'target' callback. It just has to populate the 'policy->freq_table'. Thus, check only frequency table during the frequency resolving call. Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Lukasz Luba <lukasz.luba@arm.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 68315f1a	04-Jul-2022	Pierre Gondois <pierre.gondois@arm.com>	cpufreq: Change order of online() CB and policy->cpus modification From a state where all policy->related_cpus are offline, putting one of the policy's CPU back online re-activates the policy by: 1. Calling cpufreq_driver->online() 2. Setting the CPU in policy->cpus qcom_cpufreq_hw_cpu_online() makes use of policy->cpus. Thus 1. and 2. should be inverted to avoid having a policy->cpus empty. The qcom-cpufreq-hw is the only driver affected by this. Signed-off-by: Pierre Gondois <pierre.gondois@arm.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
# a2f6a7ac	07-Jul-2022	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Warn users while freeing active policy With the new design in place, the show() and store() callbacks check if the policy is active or not before proceeding any further to avoid potential races. And in order to guarantee that cpufreq_policy_free() must be called after clearing the policy->cpus mask, i.e. by marking the policy inactive. In order to avoid introducing a bug around this later, print a warning message if we end up freeing an active policy. Also update cpufreq_online() a bit to make sure we clear the cpus mask for each error case before calling cpufreq_policy_free(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 9ab9b9d3	26-May-2022	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Drop unnecessary cpus locking from store() This change was introduced long back by commit 4f750c930822 ("cpufreq: Synchronize the cpufreq store_*() routines with CPU hotplug"). Since then, both cpufreq and hotplug core have been reworked and have much better locking in place. The race mentioned in commit 4f750c930822 isn't possible anymore. Drop the unnecessary locking. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 336e5128	26-May-2022	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Optimize cpufreq_show_cpus() Instead of specially adding a space for each CPU, except the first one, lets add space for each of them and remove it at the end. Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 514ff1bc	15-May-2022	Schspa Shi <schspa@gmail.com>	cpufreq: make interface functions and lock holding state clear cpufreq_offline() calls offline() and exit() under the policy rwsem But they are called outside the rwsem in cpufreq_online(). Make cpufreq_online() call offline() and exit() as well as online() and init() under the policy rwsem to achieve a clear lock relationship. All of the init() and online() implementations in the tree only initialize the policy object without attempting to acquire the policy rwsem and they won't call cpufreq APIs attempting to acquire it. Signed-off-by: Schspa Shi <schspa@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# d4627a28	15-May-2022	Schspa Shi <schspa@gmail.com>	cpufreq: Abort show()/store() for half-initialized policies If policy initialization fails after the sysfs files are created, there is a possibility to end up running show()/store() callbacks for half-initialized policies, which may have unpredictable outcomes. Abort show()/store() in such a case by making sure the policy is active. Also dectivate the policy on such failures. Signed-off-by: Schspa Shi <schspa@gmail.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# f339f354	11-May-2022	Rafael J. Wysocki <rafael.j.wysocki@intel.com>	cpufreq: Rearrange locking in cpufreq_remove_dev() Currently, cpufreq_remove_dev() invokes the ->exit() driver callback without holding the policy rwsem which is inconsistent with what happens if ->exit() is invoked directly from cpufreq_offline(). It also manipulates the real_cpus mask and removes the CPU device symlink without holding the policy rwsem, but cpufreq_offline() holds the rwsem around the modifications thereof. For consistency, modify cpufreq_remove_dev() to hold the policy rwsem until the ->exit() callback has been called (or it has been determined that it is not necessary to call it). Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
# fddd8f86	11-May-2022	Rafael J. Wysocki <rafael.j.wysocki@intel.com>	cpufreq: Split cpufreq_offline() Split the "core" part running under the policy rwsem out of cpufreq_offline() to allow the locking in cpufreq_remove_dev() to be rearranged more easily. As a side-effect this eliminates the unlock label that's not needed any more. No expected functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
# e1e962c5	11-May-2022	Rafael J. Wysocki <rafael.j.wysocki@intel.com>	cpufreq: Reorganize checks in cpufreq_offline() Notice that cpufreq_offline() only needs to check policy_is_inactive() once and rearrange the code in there to make that happen. No expected functional impact. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
# 5c84c1b8	11-May-2022	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Clear real_cpus mask from remove_cpu_dev_symlink() add_cpu_dev_symlink() is responsible for setting the CPUs in the real_cpus mask, the reverse of which should be done from remove_cpu_dev_symlink() to make it look clean and avoid any breakage later on. Move the call to clear the mask to remove_cpu_dev_symlink(). Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 85f0e42b	08-May-2022	Viresh Kumar <viresh.kumar@linaro.org>	Revert "cpufreq: Fix possible race in cpufreq online error path" This reverts commit f346e96267cd76175d6c201b40f770c0116a8a04. The commit tried to fix a possible real bug but it made it even worse. The fix was simply buggy as now an error out to out_offline_policy or out_exit_policy will try to release a semaphore which was never taken in the first place. This works fine only if we failed late, i.e. via out_destroy_policy. Fixes: f346e96267cd ("cpufreq: Fix possible race in cpufreq online error path") Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# f55ae08c	04-May-2022	Viresh Kumar <viresh.kumar@linaro.org>	cpufreq: Avoid unnecessary frequency updates due to mismatch For some platforms, the frequency returned by hardware may be slightly different from what is provided in the frequency table. For example, hardware may return 499 MHz instead of 500 MHz. In such cases it is better to avoid getting into unnecessary frequency updates, as we may end up switching policy->cur between the two and sending unnecessary pre/post update notifications, etc. This patch has chosen allows the hardware frequency and table frequency to deviate by 1 MHz for now, we may want to increase it a bit later on if someone still complains. Reported-by: Rex-BC Chen <rex-bc.chen@mediatek.com> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org> Tested-by: Jia-wei Chang <jia-wei.chang@mediatek.com> Reviewed-by: Matthias Brugger <matthias.bgg@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# f346e962	20-Apr-2022	Schspa Shi <schspa@gmail.com>	cpufreq: Fix possible race in cpufreq online error path When cpufreq online fails, the policy->cpus mask is not cleared and policy->rwsem is released too early, so the driver can be invoked via the cpuinfo_cur_freq sysfs attribute while its ->offline() or ->exit() callbacks are being run. Take policy->clk as an example: static int cpufreq_online(unsigned int cpu) { ... // policy->cpus != 0 at this time down_write(&policy->rwsem); ret = cpufreq_add_dev_interface(policy); up_write(&policy->rwsem); return 0; out_destroy_policy: for_each_cpu(j, policy->real_cpus) remove_cpu_dev_symlink(policy, get_cpu_device(j)); up_write(&policy->rwsem); ... out_exit_policy: if (cpufreq_driver->exit) cpufreq_driver->exit(policy); clk_put(policy->clk); // policy->clk is a wild pointer ... ^ \| Another process access __cpufreq_get cpufreq_verify_current_freq cpufreq_generic_get // acces wild pointer of policy->clk; \| \| out_offline_policy: \| cpufreq_policy_free(policy); \| // deleted here, and will wait for no body reference cpufreq_policy_put_kobj(policy); } Address this by modifying cpufreq_online() to release policy->rwsem in the error path after the driver callbacks have run and to clear policy->cpus before releasing the semaphore. Fixes: 7106e02baed4 ("cpufreq: release policy->rwsem on error") Signed-off-by: Schspa Shi <schspa@gmail.com> [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 4f774c4a	27-Jan-2022	Bjorn Andersson <bjorn.andersson@linaro.org>	cpufreq: Reintroduce ready() callback This effectively revert '4bf8e582119e ("cpufreq: Remove ready() callback")', in order to reintroduce the ready callback. This is needed in order to be able to leave the thermal pressure interrupts in the Qualcomm CPUfreq driver disabled during initialization, so that it doesn't fire while related_cpus are still 0. Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org> [ Viresh: Added the Chinese translation as well and updated commit msg ] Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
# fe262d5c	28-Dec-2021	Greg Kroah-Hartman <gregkh@linuxfoundation.org>	cpufreq: use default_groups in kobj_type There are currently 2 ways to create a set of sysfs files for a kobj_type, through the default_attrs field, and the default_groups field. Move the cpufreq code to use default_groups field which has been the preferred way since aa30f47cf666 ("kobject: Add support for default attribute groups to kobj_type") so that we can soon get rid of the obsolete default_attrs field. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 521223d8	16-Dec-2021	Rafael J. Wysocki <rafael.j.wysocki@intel.com>	cpufreq: Fix initialization of min and max frequency QoS requests The min and max frequency QoS requests in the cpufreq core are initialized to whatever the current min and max frequency values are at the init time, but if any of these values change later (for example, cpuinfo.max_freq is updated by the driver), these initial request values will be limiting the CPU frequency unnecessarily unless they are changed by user space via sysfs. To address this, initialize min_freq_req and max_freq_req to FREQ_QOS_MIN_DEFAULT_VALUE and FREQ_QOS_MAX_DEFAULT_VALUE, respectively, so they don't really limit anything until user space updates them. Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Tested-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 1e81d3e0	01-Dec-2021	Tang Yizhou <tangyizhou@huawei.com>	cpufreq: Fix a comment in cpufreq_policy_free Make the comment in blocking_notifier_call_chain() easier to understand. Signed-off-by: Tang Yizhou <tangyizhou@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 2c1b5a84	29-Nov-2021	Xiongfeng Wang <wangxiongfeng2@huawei.com>	cpufreq: Fix get_cpu_device() failure in add_cpu_dev_symlink() When I hot added a CPU, I found 'cpufreq' directory was not created below /sys/devices/system/cpu/cpuX/. It is because get_cpu_device() failed in add_cpu_dev_symlink(). cpufreq_add_dev() is the .add_dev callback of a CPU subsys interface. It will be called when the CPU device registered into the system. The call chain is as follows: register_cpu() ->device_register() ->device_add() ->bus_probe_device() ->cpufreq_add_dev() But only after the CPU device has been registered, we can get the CPU device by get_cpu_device(), otherwise it will return NULL. Since we already have the CPU device in cpufreq_add_dev(), pass it to add_cpu_dev_symlink(). I noticed that the 'kobj' of the CPU device has been added into the system before cpufreq_add_dev(). Fixes: 2f0ba790df51 ("cpufreq: Fix creation of symbolic links to policy directories") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Cc: All applicable <stable@vger.kernel.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# b894d20e	08-Sep-2021	Vincent Donnefort <vincent.donnefort@arm.com>	cpufreq: Use CPUFREQ_RELATION_E in DVFS governors Let the governors schedutil, conservative and ondemand to work, if possible on efficient frequencies only. Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
# 1f39fa0d	08-Sep-2021	Vincent Donnefort <vincent.donnefort@arm.com>	cpufreq: Introducing CPUFREQ_RELATION_E This newly introduced flag can be applied by a governor to a CPUFreq relation, when looking for a frequency within the policy table. The resolution would then only walk through efficient frequencies. Even with the flag set, the policy max limit will still be honoured. If no efficient frequencies can be found within the limits of the policy, an inefficient one would be returned. Signed-off-by: Vincent Donnefort <vincent.donnefort@arm.com> Acked-by: Viresh Kumar <viresh.kumar@linaro.org> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>