Cross Reference: /linux-master/drivers/gpu/drm/amd/pm/swsmu/smu

History log of /linux-master/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
Revision	Date	Author	Comments
# 0cd2bc06	26-Apr-2023	Yang Wang <KevinYang.Wang@amd.com>	drm/amd/pm: enable amdgpu smu send message log v1: enable amdgpu smu driver message log. v2: add smu/pmfw response value into debug log. Signed-off-by: Yang Wang <KevinYang.Wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# a62503ca	20-Dec-2023	Asad Kamal <asad.kamal@amd.com>	drm/amd/pm: Add gpu_metrics_v1_5 Add new gpu_metrics_v1_5 to acquire vcn/jpeg activity & pcie nak error counters Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 43d7e8b0	20-Dec-2023	Asad Kamal <asad.kamal@amd.com>	drm/amd/pm: Add gpu_metrics_v1_5 Add new gpu_metrics_v1_5 to acquire vcn/jpeg activity & pcie nak error counters Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Le Ma <le.ma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 34ec3ced	30-Oct-2023	Li Ma <li.ma@amd.com>	drm/amd/swsmu: update smu v14_0_0 driver if and metrics table Update driver if headers and metrics table in smu v14_0_0 after smu fw promotion. Drop the legacy metrics table and add warning of checking pmfw version. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 011d99ee	27-Sep-2023	Asad Kamal <asad.kamal@amd.com>	drm/amd/pm: Add gpu_metrics_v1_4 Add new gpu_metrics_v1_4 to acquire XGMI data transfer, pcie bandwidth & Clock lock status v2: Add pcie error counter to gpu metric table v1_4 Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# f1d1abd6	10-Aug-2023	Asad Kamal <asad.kamal@amd.com>	drm/amd/pm: Update pci link speed for smu v13.0.6 Update pcie link speed registers for smu v13.0.6 & populate gpu metric table with pcie link speed rather than gen for smu v13_0_0, smu v13_0_6 & smu v13_0_7 v2: Update ESM register address Used macro to convert pcie gen to speed v3: Chaged macro to inline function for pcie gen to speed Signed-off-by: Asad Kamal <asad.kamal@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7406f963	24-Jul-2023	Ran Sun <sunran001@208suo.com>	drm/amd/pm: Clean up errors in arcturus_ppt.c Fix the following errors reported by checkpatch: ERROR: "foo* bar" should be "foo *bar" ERROR: spaces required around that '=' (ctx:VxW) ERROR: space prohibited before that close parenthesis ')' Signed-off-by: Ran Sun <sunran001@208suo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 41cec40b	30-May-2023	Wenyou Yang <WenYou.Yang@amd.com>	drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics To acquire the voltage and current info from gpu_metrics interface, but gpu_metrics_v2_3 doesn't contain them, and to be backward compatible, add new gpu_metrics_v2_4 structure. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Wenyou Yang <WenYou.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 31865e96	16-Feb-2023	Perry Yuan <perry.yuan@amd.com>	drm/amdgpu/pm: add capped/uncapped power profile modes Capped and uncapped workload types switching are supported on Vangogh, User can switch the power profile and check current type with below commands. 1) switch to capped mode: `# echo 8 > /sys/class/drm/card0/device/pp_power_profile_mode` 2) switch to uncapped mode: `# echo 9 > /sys/class/drm/card0/device/pp_power_profile_mode` 3) check current mode: $ cat /sys/class/drm/card0/device/pp_power_profile_mode 1 3D_FULL_SCREEN 3 VIDEO 4 VR 5 COMPUTE 6 CUSTOM 8 CAPPED 9 UNCAPPED* Acked-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Perry Yuan <perry.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 51097df1	09-Jan-2023	Candice Li <candice.li@amd.com>	drm/amd/pm: Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10 Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 60cfad32	07-Nov-2022	Kenneth Feng <kenneth.feng@amd.com>	drm/amd/pm: enable mode1 reset on smu_v13_0_10 enable mode1 reset and prioritize debug port on smu_v13_0_10 as a more reliable message processing v2 - move mode1 reset callback to smu_v13_0_0_ppt.c Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 0d6516ef	05-Sep-2022	Li Ma <li.ma@amd.com>	drm/amd/pm:add new gpu_metrics_v2_3 to acquire average temperature info Add new gpu_metrics_v2_3 to acquire average temperature info from SMU metrics. To acquire average temp info from gpu_metrics interface, but gpu_metrics_v2_2 only has members to show current temp info. --- v1: Only add average_temperature_gfx in gpu_metrics_v2_3. v2: Add average temp members for soc, core and l3 in gpu_metrics_v2_3 and put these new members at the end of gpu_metrics_v2_3. Add operation to read average temp info from metrics table. v3: Merge v1 and v2 and rename the patch. v4: Merge v3. Add firmware version judgment in vangogh_common_get_gpu_metrics to maintain backward compatibility and rename the patch. "return ret" on error scenario in smu_cmn_get_smc_version. Signed-off-by: Li Ma <li.ma@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 6f73d676	25-May-2022	Evan Quan <evan.quan@amd.com>	drm/amd/pm: optimize the interface for dpm feature status query Drop extra CMN2ASIC_MAPPING_FEATURE transform. Also some cosmetic fixes for better readability. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7a09f61f	25-May-2022	Alex Deucher <alexander.deucher@amd.com>	drm/amdgpu/swsmu: use new register offsets for smu_cmn.c Use the per asic offsets so the we don't have to have asic specific logic in the common code. Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 8763e4c1	17-May-2022	Huang Rui <ray.huang@amd.com>	drm/amdgpu/pm: update MP v13_0_4 smu message register marco Update MP v13_0_4 register macro for SMU message v2: squash in missed case (Alex) Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Tim Huang <Tim.Huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 334682ae	21-Apr-2022	Kenneth Feng <kenneth.feng@amd.com>	drm/amd/pm: enable workload type change on smu_v13_0_7 enable workload type change on smu_v13_0_7 v2: squash in out of bounds fix (Alex) Signed-off-by: Kenneth Feng <kenneth.feng@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 276c03a0	06-Apr-2022	Evan Quan <evan.quan@amd.com>	drm/amd/smu: Update SMU13 support for SMU 13.0.0 Modify the common smu13 code and add a new smu 13.0.0 ppt file to handle the smu 13.0.0 specific configuration. v2: squash in typo fix in profile name Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 6a2d7a22	06-Apr-2022	Evan Quan <evan.quan@amd.com>	drm/amd/pm: enable the support for retrieving combo pptable We need to relay on this way to get the raw PPTable when SCPM feature is enabled. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# f24044bd	05-Apr-2022	Darren Powell <darren.powell@amd.com>	amdgpu/pm: Clarify documentation of error handling in send_smc_mesg Clarify the smu_cmn_send_smc_msg_with_param documentation to mention two cases exist where messages are silently dropped with no error returned. These cases occur in unusual situations where either: 1. the message type is not allowed to a virtual GPU, or 2. a PCI recovery is underway and the HW is not yet in sync with the SW For more details see commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter") commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state") (v2) Reworked with suggestions from Luben & Paul (v3) Updated wording as per Luben's feedback Corrected error stating all messages denied on virtual GPU (each GPU has mask of which messages are allowed) Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 508a47d4	14-Mar-2022	Dan Carpenter <dan.carpenter@oracle.com>	drm/amd/pm: fix indenting in __smu_cmn_reg_print_error() Smatch complains that the dev_err_ratelimited() is indented one tab more than the surrounding lines. drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:174 __smu_cmn_reg_print_error() warn: inconsistent indenting It looks like it's not a bug, just that the indenting needs to be cleaned up. Reviewed-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 8718ca1d	24-Feb-2022	Lijo Lazar <lijo.lazar@amd.com>	drm/amd/pm: Send message when resp status is 0xFC When PMFW is really busy, it will respond with 0xFC. However, it doesn't change the response register state when it becomes free. Driver should retry and proceed to send message if the response status is 0xFC. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7c916f95	06-Mar-2022	Yifan Zhang <yifan1.zhang@amd.com>	drm/amdgpu: change registers in error checking for smu 13.0.5 smu 13.0.5 use new registers for smu msg and param. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# a1235a01	22-Feb-2022	Lijo Lazar <lijo.lazar@amd.com>	drm/amd/pm: Fix missing prototype warning Fix below warning warning: no previous prototype for '__smu_get_enabled_features' [-Wmissing-prototypes] Fixes: f141e251474d67 ("drm/amd/pm: validate SMU feature enable message for getting feature enabled mask") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# f141e251	10-Feb-2022	Prike Liang <Prike.Liang@amd.com>	drm/amd/pm: validate SMU feature enable message for getting feature enabled mask There's always miss the SMU feature enabled checked in the NPI phase, so let validate the SMU feature enable message directly rather than add more and more MP1 version check. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Lijo Lazar <Lijo.Lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# cec24112	22-Jan-2022	Yifan Zhang <yifan1.zhang@amd.com>	drm/amd/pm: update smc message sequence for smu 13.0.5 this patch updates smc message sequence for smu 13.0.5. Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# db090ff8	12-Nov-2021	Prike Liang <Prike.Liang@amd.com>	drm/amd/pm: Add support for MP1 13.0.8 Set smu sw function and enable swSMU support for MP1. Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# cc188a73	08-Feb-2022	Evan Quan <evan.quan@amd.com>	drm/amd/pm: fix enabled features retrieving on Renoir and Cyan Skillfish For Cyan Skillfish and Renoir, there is no interface provided by PMFW to retrieve the enabled features. So, we assume all features are enabled. Fixes: 7ade3ca9cdb5 ("drm/amd/pm: correct the usage for 'supported' member of smu_feature structure") Signed-off-by: Evan Quan <evan.quan@amd.com> Tested-by: Nathan Chancellor <nathan@kernel.org> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# f69c15e1	08-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: revise the implementation of smu_cmn_disable_all_features_with_exception As there is no internal cache for enabled ppfeatures now. Thus the 2nd parameter will be not needed any more. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# a89ef044	08-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: avoid consecutive retrieving for enabled ppfeatures As the enabled ppfeatures are just retrieved ahead. We can use that directly instead of retrieving again and again. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 3c6591e9	08-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: drop the cache for enabled ppfeatures The following scenarios make the driver cache for enabled ppfeatures outdated and invalid: - Other tools interact with PMFW to change the enabled ppfeatures. - PMFW may enable/disable some features behind driver's back. E.g. for sienna_cichild, on gfxoff entering, PMFW will disable gfx related DPM features. All those are performed without driver's notice. Also considering driver does not actually interact with PMFW such frequently, the benefit brought by such cache is very limited. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 2d282665	07-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: update the data type for retrieving enabled ppfeatures Use uint64_t instead of an array of uint32_t. This can avoid some non-necessary intermediate uint32_t -> uint64_t conversions. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 5af779ad	07-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: unify the interface for retrieving enabled ppfeatures Instead of having two which do the same thing. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# bd425711	07-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: correct the way for retrieving enabled ppfeatures on Renoir As other dGPU asics, Renoir should use smu_cmn_get_enabled_mask() for that job. Signed-off-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 1f2cf08a	28-Nov-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: drop unneeded feature->mutex As all those related APIs are already well protected by adev->pm.mutex. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# da11407f	28-Nov-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: drop unneeded smu->metrics_lock As all those related APIs are already well protected by adev->pm.mutex and smu->message_lock. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7e31a858	12-Dec-2021	Evan Quan <evan.quan@amd.com>	drm/amdgpu: move smu_debug_mask to a more proper place As the smu_context will be invisible from outside(of power). Also, the smu_debug_mask can be shared around all power code instead of some specific framework(swSMU) only. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 6ff7fddb	29-Nov-2021	Lang Yu <lang.yu@amd.com>	drm/amdgpu: add support for SMU debug option SMU firmware expects the driver maintains error context and doesn't interact with SMU any more when SMU errors occurred. That will aid in debugging SMU firmware issues. Add SMU debug option support for this request, it can be enabled or disabled via amdgpu_smu_debug debugfs file. Use a 32-bit mask to indicate corresponding debug modes. Currently, only one mode(HALT_ON_ERROR) is supported. When enabled, it brings hardware to a kind of halt state so that no one can touch it any more in the envent of SMU errors. The dirver interacts with SMU via sending messages. And threre are three ways to sending messages to SMU in current implementation. Handle them respectively as following: 1, smu_cmn_send_smc_msg_with_param() for normal timeout cases Halt on any error. 2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response() for longer timeout cases Halt on errors apart from ETIME. Otherwise this way won't work. Let the user handle ETIME error in such a case. 3, smu_cmn_send_msg_without_waiting() for no waiting cases Halt on errors apart from ETIME. Otherwise second way won't work. == Command Guide == 1, enable HALT_ON_ERROR mode # echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug 2, disable HALT_ON_ERROR mode # echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug v5: - Use bit mask to allow more debug features.(Evan) - Use WRAN() instead of BUG().(Evan) v4: - Set to halt state instead of a simple hang.(Christian) v3: - Use debugfs_create_bool().(Christian) - Put variable into smu_context struct. - Don't resend command when timeout. v2: - Resend command when timeout.(Lijo) - Use debugfs file instead of module parameter. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# bbe04dec	07-Dec-2021	Isabella Basso <isabbasso@riseup.net>	drm/amd: fix improper docstring syntax This fixes various warnings relating to erroneous docstring syntax, of which some are listed below: warning: Function parameter or member 'adev' not described in 'amdgpu_atomfirmware_ras_rom_addr' ... warning: expecting prototype for amdgpu_atpx_validate_functions(). Prototype was for amdgpu_atpx_validate() instead ... warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new' ... warning: Cannot understand * @kfd_get_cu_occupancy - Collect number of waves in-flight on this device ... warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Signed-off-by: Isabella Basso <isabbasso@riseup.net> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# e771d71d	17-Nov-2021	Luben Tuikov <luben.tuikov@amd.com>	drm/amd/pm: Print the error on command submission Print the error on command submission immediately after submitting to the SMU. This is rate-limited. It helps to immediately know there was an error on command submission, rather than leave it up to clients to report the error, as sometimes they do not. Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# ca4b32bb	17-Nov-2021	Luben Tuikov <luben.tuikov@amd.com>	drm/amd/pm: Add debug prints Add prints where there are none and none are printed in the callee. Remove the word "previous" from comment and print to make it shorter and avoid confusion in various prints. Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 8bd1b7c2	10-Nov-2021	Luben Tuikov <luben.tuikov@amd.com>	drm/amd/pm: Enhanced reporting also for a stuck command Also print the message index and parameter of the stuck command. Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 38a268b3	10-Nov-2021	Luben Tuikov <luben.tuikov@amd.com>	drm/amd/pm: Enhanced reporting also for a stuck command Also print the message index and parameter of the stuck command. Cc: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# be68d44b	08-Sep-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver Current RUNPM mechanism relies on PMFW to master the timing for BACO in/exit. And that needs cooperation from sound driver for dstate change notification for function 1(audio). Otherwise(on sound driver missing), BACO cannot be kicked in correctly and hang will be observed on RUNPM exit. By switching back to legacy message way on sound driver missing, we are able to fix the runpm hang observed for the scenario below: amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 8b514e89	08-Sep-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver Current RUNPM mechanism relies on PMFW to master the timing for BACO in/exit. And that needs cooperation from sound driver for dstate change notification for function 1(audio). Otherwise(on sound driver missing), BACO cannot be kicked in correctly and hang will be observed on RUNPM exit. By switching back to legacy message way on sound driver missing, we are able to fix the runpm hang observed for the scenario below: amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded Signed-off-by: Evan Quan <evan.quan@amd.com> Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
# 828db598	30-Jun-2021	Darren Powell <darren.powell@amd.com>	amdgpu/pm: Replace navi10 usage of sprintf with sysfs_emit initial modification of files smu_cmn.c navi10_ppt.c === Test === AMDGPU_PCI_ADDR=`lspci -nn \| grep "VGA\\|Display" \| cut -d " " -f 1` AMDGPU_HWMON=`ls -la /sys/class/hwmon \| grep $AMDGPU_PCI_ADDR \| awk '{print $9}'` HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON} LOGFILE=pp_printf.test.log lspci -nn \| grep "VGA\\|Display" > $LOGFILE FILES="pp_dpm_sclk pp_sclk_od pp_mclk_od pp_dpm_pcie pp_od_clk_voltage pp_power_profile_mode " for f in $FILES do echo === $f === >> $LOGFILE cat $HWMON_DIR/device/$f >> $LOGFILE done cat $LOGFILE Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 544dcd74	28-Jul-2021	Luben Tuikov <luben.tuikov@amd.com>	drm/amd/pm: Fix a bug in semaphore double-lock Fix a bug in smu_cmn_send_msg_without_waiting() in that this function does not need to take the smu->message_lock mutex in order to send a message down to the SMU. The mutex is acquired by the caller of this function instead. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Changfeng Zhu <Changfeng.Zhu@amd.com> Cc: Huang Rui <ray.huang@amd.com> Fixes: 5810323ba69289 ("drm/amd/pm: Fix a bug communicating with the SMU (v5)") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 5810323b	09-Jul-2021	Luben Tuikov <luben.tuikov@amd.com>	drm/amd/pm: Fix a bug communicating with the SMU (v5) This fixes a bug which if we probe a non-existing I2C device, and the SMU returns 0xFF, from then on we can never communicate with the SMU, because the code before this patch reads and interprets 0xFF as a terminal error, and thus we never write 0 into register 90 to clear the status (and subsequently send a new command to the SMU.) It is not an error that the SMU returns status 0xFF. This means that the SMU executed the last command successfully (execution status), but the command result is an error of some sort (execution result), depending on what the command was. When doing a status check of the SMU, before we send a new command, the only status which precludes us from sending a new command is 0--the SMU hasn't finished executing a previous command, and 0xFC--the SMU is busy. This bug was seen as the following line in the kernel log, amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state! when subsequent SMU commands, not necessarily related to I2C, were sent to the SMU. This patch fixes this bug. v2: Add a comment to the description of __smu_cmn_poll_stat() to explain why we're NOT defining the SMU FW return codes as macros, but are instead hard-coding them. Such a change, can be followed up by a subsequent patch. v3: The changes are, a) Add comments to break labels in __smu_cmn_reg2errno(). b) When an unknown/unspecified/undefined result is returned back from the SMU, map that to -EREMOTEIO, to distinguish failure at the SMU FW. c) Add kernel-doc to smu_cmn_send_msg_without_waiting(), smu_cmn_wait_for_response(), smu_cmn_send_smc_msg_with_param(). d) In smu_cmn_send_smc_msg_with_param(), since we wait for completion of the command, if the result of the completion is undefined/unknown/unspecified, we print that to the kernel log. v4: a) Add macros as requested, though redundant, to be removed when SMU consolidates for all ASICs--see comment in code. b) Get out if the SMU code is unknown. v5: Rename the macro names. Cc: Alex Deucher <Alexander.Deucher@amd.com> Cc: Evan Quan <evan.quan@amd.com> Cc: Lijo Lazar <Lijo.Lazar@amd.com> Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message") Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 1e75be2b	07-Dec-2020	Evan Quan <evan.quan@amd.com>	drm/amd/pm: update the cached dpm feature status For some ASICs, the real dpm feature disablement job is handled by PMFW during baco reset and custom pptable loading. Cached dpm feature status need to be updated to pair that. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# c23083cd	07-Jun-2021	Graham Sider <Graham.Sider@amd.com>	drm/amd/pm: Add common throttler translation func Defines smu_cmn_get_indep_throttler_status which performs ASIC independent translation given a corresponding lookup table. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 22a7dcf5	03-Jun-2021	Graham Sider <Graham.Sider@amd.com>	drm/amd/pm: Add u64 throttler status field to gpu_metrics This patch set adds support for a new ASIC independant u64 throttler status field (indep_throttle_status). Piggybacks off the gpu_metrics_v1_3 bump and similarly bumps gpu_metrics_v2 version (to v2_2) to add field. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7afefb81	21-May-2021	Andrey Grodzovsky <andrey.grodzovsky@amd.com>	drm/amdgpu: Rename flag which prevents HW access Make it's name not feature but function descriptive. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210521204122.762288-1-andrey.grodzovsky@amd.com
# 0b7db431	14-May-2021	David M Nieto <david.nieto@amd.com>	drm/amdgpu/pm: Update metrics table (v2) v2: removed static dpm and frequency ranges from table expand metrics table with voltages and frequency ranges Signed-off-by: David M Nieto <david.nieto@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# cfd053be	26-Apr-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: expose pmfw attached timestamp on Aldebaran Available with 68.18.0 and later PMFWs. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 1e4a53de	09-Apr-2021	Darren Powell <darren.powell@amd.com>	amdgpu/pm: add extra info to SMU msg pre-check failed message Insert the value of the response to error message emitted when the SMU msg pre-check failes Signed-off-by: Darren Powell <darren.powell@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 05516264	07-Apr-2021	charles sun <charles.sun@amd.com>	drm/amd/pm: increase time out value when sending msg to SMU when do S3 stress, low rate that PowerUpVcn message will get response more than 1s, so here increase the timeout to 2s Signed-off-by: charles sun <charles.sun@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Acked-by: Alex Deucher <Alexander.Deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# e42569d0	16-Mar-2021	Lijo Lazar <lijo.lazar@amd.com>	drm/amd/pm: Modify mode2 msg sequence on aldebaran v1: During mode2 reset, PCI space is lost after message is sent. Restore PCI space before waiting for response from firmware. v2: Move mode2 sequence to aldebaran and update PMFW version. Handle generic sequence in smu13 without PMFW version check. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 1689fca0	16-Mar-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: fix Navi1x runtime resume failure V2 The RLC was put into a wrong state on runtime suspend. Thus the RLC autoload will fail on the succeeding runtime resume. By adding an intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved, designed for PnP), we can bring RLC back into the desired state. V2: integrate INTERRUPTS_ENABLED flag clearing into current mp1 state set routines Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 4215a119	25-Feb-2021	Horace Chen <horace.chen@amd.com>	drm/amdgpu: enable one vf mode on sienna cichlid vf sienna cichlid needs one vf mode which allows vf to set and get clock status from guest vm. So now expose the required interface and allow some smu request on VF mode. Also since this asic blocked direct MMIO access, use KIQ to send SMU request under sriov vf. OD use same command as getting pp table which is not allowed for sienna cichlid, so remove OD feature under sriov vf. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Monk Liu<monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 152bb95c	28-Feb-2021	Evan Quan <evan.quan@amd.com>	drm/amd/pm: update existing gpu_metrics interfaces V2 Update the gpu_metrics interface implementations to use the latest upgraded data structures. V2: fit the data type change of energy_accumulator Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 8addf37c	18-Feb-2021	Nathan Chancellor <nathan@kernel.org>	drm/amd/pm/swsmu: Avoid using structure_size uninitialized in smu_cmn_init_soft_gpu_metrics Clang warns: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:764:2: warning: variable 'structure_size' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] default: ^~~~~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:770:23: note: uninitialized use occurs here memset(header, 0xFF, structure_size); ^~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:753:25: note: initialize the variable 'structure_size' to silence this warning uint16_t structure_size; ^ = 0 1 warning generated. Return in the default case, as the size of the header will not be known. Fixes: de4b7cd8cb87 ("drm/amd/pm/swsmu: unify the init soft gpu metrics function") Link: https://github.com/ClangBuiltLinux/linux/issues/1304 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7b3d19a7	18-Feb-2021	Nathan Chancellor <nathan@kernel.org>	drm/amd/pm/swsmu: Avoid using structure_size uninitialized in smu_cmn_init_soft_gpu_metrics Clang warns: drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:764:2: warning: variable 'structure_size' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] default: ^~~~~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:770:23: note: uninitialized use occurs here memset(header, 0xFF, structure_size); ^~~~~~~~~~~~~~ drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:753:25: note: initialize the variable 'structure_size' to silence this warning uint16_t structure_size; ^ = 0 1 warning generated. Return in the default case, as the size of the header will not be known. Fixes: de4b7cd8cb87 ("drm/amd/pm/swsmu: unify the init soft gpu metrics function") Link: https://github.com/ClangBuiltLinux/linux/issues/1304 Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# de4b7cd8	03-Feb-2021	Kevin Wang <kevin1.wang@amd.com>	drm/amd/pm/swsmu: unify the init soft gpu metrics function the soft gpu metrics is not asic related data structure. unify them to reduce duplicate code. Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# e484de44	20-Jan-2021	Huang Rui <ray.huang@amd.com>	drm/amd/pm: print the timeout of smc message This patch is to help firmware designer to know the smc message timeout status. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 96673790	12-Jan-2021	Huang Rui <ray.huang@amd.com>	drm/amd/pm: fix the return value of pm message 0 should be right driver return value, 0x1 is the right firmware return value. So switch to 0 at last. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Xiaojian Du <xiaojian.du@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 74353883	02-Dec-2020	Huang Rui <ray.huang@amd.com>	drm/amdgpu: revise the mode2 reset for vangogh PCIE MMIO bar needs to be restored firstly after the reset event triggers. So it's unable to access the registers to wait for response from SMU. Becasue the value of mmMP1_SMN_C2PMSG_90 is invalid at that moment. Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 47381540	07-Jan-2021	Huang Rui <ray.huang@amd.com>	drm/amd/pm: don't mark all apu as true on feature mask VHG based APU will support feature mask checking. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 11db224b	08-Jan-2021	Huang Rui <ray.huang@amd.com>	drm/amd/pm: enhance the real response for smu message (v2) The user prefers to know the real response value from C2PMSG 90 register which is written by firmware not -EIO. v2: return C2PMSG 90 value Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 0339258b	26-Nov-2020	Evan Quan <evan.quan@amd.com>	drm/amd/pm: invalidate hdp before CPU access the memory written by GPU To eliminate the possible influence by outdated HDP read cache. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# ac70c6c6	26-Oct-2020	Xiaojian Du <Xiaojian.Du@amd.com>	drm/amd/pm: add one new function to get 32 bit feature mask for vangogh This patch is to add one new function to get 32 bit feature mask for vangogh. Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# a6c42e84	16-Oct-2020	Kevin Wang <kevin1.wang@amd.com>	drm/amd/swsmu: correct wrong feature bit mapping 1. when smc feature bit isn't mapped, the feature state isn't showed on sysfs node of pp_features. 2. add pp_features table title Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 7aeef2aa	16-Oct-2020	Kevin Wang <kevin1.wang@amd.com>	drm/amd/swsmu: correct wrong feature bit mapping 1. when smc feature bit isn't mapped, the feature state isn't showed on sysfs node of pp_features. 2. add pp_features table title Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# 10144762	31-Aug-2020	Evan Quan <evan.quan@amd.com>	drm/amd/pm: postpone SOCCLK/UCLK enablement after DAL initialization(V2) This is needed for Navi1X only. And it may help for display missing or hang issue seen on some high resolution monitors. V2: no UCLK DPM enablement for Navi10 A0 secure SKU Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# bf36b52e	29-Jul-2020	Andrey Grodzovsky <andrey.grodzovsky@amd.com>	drm/amdgpu: Avoid accessing HW when suspending SW state At this point the ASIC is already post reset by the HW/PSP so the HW not in proper state to be configured for suspension, some blocks might be even gated and so best is to avoid touching it. v2: Rename in_dpc to more meaningful name Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
# e098bc96	13-Aug-2020	Evan Quan <evan.quan@amd.com>	drm/amd/pm: optimize the power related source code layout The target is to provide a clear entry point(for power routines). Also this can help to maintain a clear view about the frameworks used on different ASICs. Hopefully all these can make power part more friendly to play with. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>