History log of /linux-master/drivers/gpu/drm/amd/pm/swsmu/smu_cmn.c
Revision Date Author Comments
# 0cd2bc06 26-Apr-2023 Yang Wang <KevinYang.Wang@amd.com>

drm/amd/pm: enable amdgpu smu send message log

v1:
enable amdgpu smu driver message log.

v2:
add smu/pmfw response value into debug log.

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a62503ca 20-Dec-2023 Asad Kamal <asad.kamal@amd.com>

drm/amd/pm: Add gpu_metrics_v1_5

Add new gpu_metrics_v1_5 to acquire vcn/jpeg activity
& pcie nak error counters

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 43d7e8b0 20-Dec-2023 Asad Kamal <asad.kamal@amd.com>

drm/amd/pm: Add gpu_metrics_v1_5

Add new gpu_metrics_v1_5 to acquire vcn/jpeg activity
& pcie nak error counters

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 34ec3ced 30-Oct-2023 Li Ma <li.ma@amd.com>

drm/amd/swsmu: update smu v14_0_0 driver if and metrics table

Update driver if headers and metrics table in smu v14_0_0 after smu fw promotion.
Drop the legacy metrics table and add warning of checking pmfw version.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 011d99ee 27-Sep-2023 Asad Kamal <asad.kamal@amd.com>

drm/amd/pm: Add gpu_metrics_v1_4

Add new gpu_metrics_v1_4 to acquire XGMI data transfer,
pcie bandwidth & Clock lock status

v2:
Add pcie error counter to gpu metric table v1_4

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f1d1abd6 10-Aug-2023 Asad Kamal <asad.kamal@amd.com>

drm/amd/pm: Update pci link speed for smu v13.0.6

Update pcie link speed registers for smu v13.0.6 &
populate gpu metric table with pcie link speed rather than
gen for smu v13_0_0, smu v13_0_6 & smu v13_0_7

v2:
Update ESM register address
Used macro to convert pcie gen to speed

v3:
Chaged macro to inline function for pcie gen to speed

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7406f963 24-Jul-2023 Ran Sun <sunran001@208suo.com>

drm/amd/pm: Clean up errors in arcturus_ppt.c

Fix the following errors reported by checkpatch:

ERROR: "foo* bar" should be "foo *bar"
ERROR: spaces required around that '=' (ctx:VxW)
ERROR: space prohibited before that close parenthesis ')'

Signed-off-by: Ran Sun <sunran001@208suo.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 41cec40b 30-May-2023 Wenyou Yang <WenYou.Yang@amd.com>

drm/amd/pm: Vangogh: Add new gpu_metrics_v2_4 to acquire gpu_metrics

To acquire the voltage and current info from gpu_metrics interface,
but gpu_metrics_v2_3 doesn't contain them, and to be backward compatible,
add new gpu_metrics_v2_4 structure.

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Wenyou Yang <WenYou.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 31865e96 16-Feb-2023 Perry Yuan <perry.yuan@amd.com>

drm/amdgpu/pm: add capped/uncapped power profile modes

Capped and uncapped workload types switching are supported on Vangogh,
User can switch the power profile and check current type with below commands.

1) switch to capped mode:
`# echo 8 > /sys/class/drm/card0/device/pp_power_profile_mode`

2) switch to uncapped mode:
`# echo 9 > /sys/class/drm/card0/device/pp_power_profile_mode`

3) check current mode:
$ cat /sys/class/drm/card0/device/pp_power_profile_mode
1 3D_FULL_SCREEN
3 VIDEO
4 VR
5 COMPUTE
6 CUSTOM
8 CAPPED
9 UNCAPPED*

Acked-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 51097df1 09-Jan-2023 Candice Li <candice.li@amd.com>

drm/amd/pm: Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10

Support RAS fatal error mode1 reset on smu v13_0_0 and v13_0_10.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 60cfad32 07-Nov-2022 Kenneth Feng <kenneth.feng@amd.com>

drm/amd/pm: enable mode1 reset on smu_v13_0_10

enable mode1 reset and prioritize debug port on smu_v13_0_10
as a more reliable message processing

v2 - move mode1 reset callback to smu_v13_0_0_ppt.c

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0d6516ef 05-Sep-2022 Li Ma <li.ma@amd.com>

drm/amd/pm:add new gpu_metrics_v2_3 to acquire average temperature info

Add new gpu_metrics_v2_3 to acquire average temperature info from SMU metrics. To acquire average temp info from gpu_metrics interface, but gpu_metrics_v2_2 only has members to show current temp info.
---
v1:
Only add average_temperature_gfx in gpu_metrics_v2_3.
v2:
Add average temp members for soc, core and l3 in gpu_metrics_v2_3 and put these new members at the end of gpu_metrics_v2_3. Add operation to read average temp info from metrics table.
v3:
Merge v1 and v2 and rename the patch.
v4:
Merge v3. Add firmware version judgment in vangogh_common_get_gpu_metrics to maintain backward compatibility and rename the patch. "return ret" on error scenario in smu_cmn_get_smc_version.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6f73d676 25-May-2022 Evan Quan <evan.quan@amd.com>

drm/amd/pm: optimize the interface for dpm feature status query

Drop extra CMN2ASIC_MAPPING_FEATURE transform. Also some cosmetic
fixes for better readability.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7a09f61f 25-May-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/swsmu: use new register offsets for smu_cmn.c

Use the per asic offsets so the we don't have to have
asic specific logic in the common code.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8763e4c1 17-May-2022 Huang Rui <ray.huang@amd.com>

drm/amdgpu/pm: update MP v13_0_4 smu message register marco

Update MP v13_0_4 register macro for SMU message

v2: squash in missed case (Alex)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 334682ae 21-Apr-2022 Kenneth Feng <kenneth.feng@amd.com>

drm/amd/pm: enable workload type change on smu_v13_0_7

enable workload type change on smu_v13_0_7

v2: squash in out of bounds fix (Alex)

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 276c03a0 06-Apr-2022 Evan Quan <evan.quan@amd.com>

drm/amd/smu: Update SMU13 support for SMU 13.0.0

Modify the common smu13 code and add a new smu
13.0.0 ppt file to handle the smu 13.0.0 specific
configuration.

v2: squash in typo fix in profile name

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6a2d7a22 06-Apr-2022 Evan Quan <evan.quan@amd.com>

drm/amd/pm: enable the support for retrieving combo pptable

We need to relay on this way to get the raw PPTable when
SCPM feature is enabled.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f24044bd 05-Apr-2022 Darren Powell <darren.powell@amd.com>

amdgpu/pm: Clarify documentation of error handling in send_smc_mesg

Clarify the smu_cmn_send_smc_msg_with_param documentation to mention two
cases exist where messages are silently dropped with no error returned.
These cases occur in unusual situations where either:
1. the message type is not allowed to a virtual GPU, or
2. a PCI recovery is underway and the HW is not yet in sync with the SW

For more details see
commit 4ea5081c82c4 ("drm/amd/powerplay: enable SMC message filter")
commit bf36b52e781d ("drm/amdgpu: Avoid accessing HW when suspending SW state")

(v2)
Reworked with suggestions from Luben & Paul

(v3)
Updated wording as per Luben's feedback
Corrected error stating all messages denied on virtual GPU
(each GPU has mask of which messages are allowed)

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 508a47d4 14-Mar-2022 Dan Carpenter <dan.carpenter@oracle.com>

drm/amd/pm: fix indenting in __smu_cmn_reg_print_error()

Smatch complains that the dev_err_ratelimited() is indented one tab more
than the surrounding lines.

drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:174
__smu_cmn_reg_print_error() warn: inconsistent indenting

It looks like it's not a bug, just that the indenting needs to be cleaned
up.

Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8718ca1d 24-Feb-2022 Lijo Lazar <lijo.lazar@amd.com>

drm/amd/pm: Send message when resp status is 0xFC

When PMFW is really busy, it will respond with 0xFC. However, it doesn't
change the response register state when it becomes free. Driver should
retry and proceed to send message if the response status is 0xFC.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7c916f95 06-Mar-2022 Yifan Zhang <yifan1.zhang@amd.com>

drm/amdgpu: change registers in error checking for smu 13.0.5

smu 13.0.5 use new registers for smu msg and param.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a1235a01 22-Feb-2022 Lijo Lazar <lijo.lazar@amd.com>

drm/amd/pm: Fix missing prototype warning

Fix below warning
warning: no previous prototype for '__smu_get_enabled_features' [-Wmissing-prototypes]

Fixes: f141e251474d67 ("drm/amd/pm: validate SMU feature enable message for getting feature enabled mask")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f141e251 10-Feb-2022 Prike Liang <Prike.Liang@amd.com>

drm/amd/pm: validate SMU feature enable message for getting feature enabled mask

There's always miss the SMU feature enabled checked in the NPI phase,
so let validate the SMU feature enable message directly rather than
add more and more MP1 version check.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Lijo Lazar <Lijo.Lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cec24112 22-Jan-2022 Yifan Zhang <yifan1.zhang@amd.com>

drm/amd/pm: update smc message sequence for smu 13.0.5

this patch updates smc message sequence for smu 13.0.5.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# db090ff8 12-Nov-2021 Prike Liang <Prike.Liang@amd.com>

drm/amd/pm: Add support for MP1 13.0.8

Set smu sw function and enable swSMU support for MP1.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cc188a73 08-Feb-2022 Evan Quan <evan.quan@amd.com>

drm/amd/pm: fix enabled features retrieving on Renoir and Cyan Skillfish

For Cyan Skillfish and Renoir, there is no interface provided by PMFW
to retrieve the enabled features. So, we assume all features are enabled.

Fixes: 7ade3ca9cdb5 ("drm/amd/pm: correct the usage for 'supported' member of smu_feature structure")

Signed-off-by: Evan Quan <evan.quan@amd.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f69c15e1 08-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: revise the implementation of smu_cmn_disable_all_features_with_exception

As there is no internal cache for enabled ppfeatures now. Thus the 2nd
parameter will be not needed any more.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a89ef044 08-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: avoid consecutive retrieving for enabled ppfeatures

As the enabled ppfeatures are just retrieved ahead. We can use
that directly instead of retrieving again and again.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3c6591e9 08-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: drop the cache for enabled ppfeatures

The following scenarios make the driver cache for enabled ppfeatures
outdated and invalid:
- Other tools interact with PMFW to change the enabled ppfeatures.
- PMFW may enable/disable some features behind driver's back. E.g.
for sienna_cichild, on gfxoff entering, PMFW will disable gfx
related DPM features. All those are performed without driver's
notice.
Also considering driver does not actually interact with PMFW such
frequently, the benefit brought by such cache is very limited.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2d282665 07-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: update the data type for retrieving enabled ppfeatures

Use uint64_t instead of an array of uint32_t. This can avoid
some non-necessary intermediate uint32_t -> uint64_t conversions.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5af779ad 07-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: unify the interface for retrieving enabled ppfeatures

Instead of having two which do the same thing.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bd425711 07-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: correct the way for retrieving enabled ppfeatures on Renoir

As other dGPU asics, Renoir should use smu_cmn_get_enabled_mask() for
that job.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1f2cf08a 28-Nov-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: drop unneeded feature->mutex

As all those related APIs are already well protected by adev->pm.mutex.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# da11407f 28-Nov-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: drop unneeded smu->metrics_lock

As all those related APIs are already well protected by
adev->pm.mutex and smu->message_lock.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7e31a858 12-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amdgpu: move smu_debug_mask to a more proper place

As the smu_context will be invisible from outside(of power). Also,
the smu_debug_mask can be shared around all power code instead of
some specific framework(swSMU) only.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6ff7fddb 29-Nov-2021 Lang Yu <lang.yu@amd.com>

drm/amdgpu: add support for SMU debug option

SMU firmware expects the driver maintains error context
and doesn't interact with SMU any more when SMU errors
occurred. That will aid in debugging SMU firmware issues.

Add SMU debug option support for this request, it can be
enabled or disabled via amdgpu_smu_debug debugfs file.
Use a 32-bit mask to indicate corresponding debug modes.
Currently, only one mode(HALT_ON_ERROR) is supported.
When enabled, it brings hardware to a kind of halt state
so that no one can touch it any more in the envent of SMU
errors.

The dirver interacts with SMU via sending messages. And
threre are three ways to sending messages to SMU in current
implementation. Handle them respectively as following:

1, smu_cmn_send_smc_msg_with_param() for normal timeout cases

Halt on any error.

2, smu_cmn_send_msg_without_waiting()/smu_cmn_wait_for_response()
for longer timeout cases

Halt on errors apart from ETIME. Otherwise this way won't work.
Let the user handle ETIME error in such a case.

3, smu_cmn_send_msg_without_waiting() for no waiting cases

Halt on errors apart from ETIME. Otherwise second way won't work.

== Command Guide ==

1, enable HALT_ON_ERROR mode

# echo 0x1 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

2, disable HALT_ON_ERROR mode

# echo 0x0 > /sys/kernel/debug/dri/0/amdgpu_smu_debug

v5:
- Use bit mask to allow more debug features.(Evan)
- Use WRAN() instead of BUG().(Evan)

v4:
- Set to halt state instead of a simple hang.(Christian)

v3:
- Use debugfs_create_bool().(Christian)
- Put variable into smu_context struct.
- Don't resend command when timeout.

v2:
- Resend command when timeout.(Lijo)
- Use debugfs file instead of module parameter.

Signed-off-by: Lang Yu <lang.yu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bbe04dec 07-Dec-2021 Isabella Basso <isabbasso@riseup.net>

drm/amd: fix improper docstring syntax

This fixes various warnings relating to erroneous docstring syntax, of
which some are listed below:

warning: Function parameter or member 'adev' not described in
'amdgpu_atomfirmware_ras_rom_addr'
...
warning: expecting prototype for amdgpu_atpx_validate_functions().
Prototype was for amdgpu_atpx_validate() instead
...
warning: Excess function parameter 'mem' description in 'amdgpu_preempt_mgr_new'
...
warning: Cannot understand * @kfd_get_cu_occupancy - Collect number of
waves in-flight on this device
...
warning: This comment starts with '/**', but isn't a kernel-doc
comment. Refer Documentation/doc-guide/kernel-doc.rst

Signed-off-by: Isabella Basso <isabbasso@riseup.net>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e771d71d 17-Nov-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amd/pm: Print the error on command submission

Print the error on command submission immediately after submitting to
the SMU. This is rate-limited. It helps to immediately know there was an
error on command submission, rather than leave it up to clients to report
the error, as sometimes they do not.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ca4b32bb 17-Nov-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amd/pm: Add debug prints

Add prints where there are none and none are printed in the callee.

Remove the word "previous" from comment and print to make it shorter and
avoid confusion in various prints.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8bd1b7c2 10-Nov-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amd/pm: Enhanced reporting also for a stuck command

Also print the message index and parameter of the stuck command.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 38a268b3 10-Nov-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amd/pm: Enhanced reporting also for a stuck command

Also print the message index and parameter of the stuck command.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# be68d44b 08-Sep-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver

Current RUNPM mechanism relies on PMFW to master the timing for BACO
in/exit. And that needs cooperation from sound driver for dstate
change notification for function 1(audio). Otherwise(on sound driver
missing), BACO cannot be kicked in correctly and hang will be observed
on RUNPM exit.

By switching back to legacy message way on sound driver missing,
we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8b514e89 08-Sep-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: fix runpm hang when amdgpu loaded prior to sound driver

Current RUNPM mechanism relies on PMFW to master the timing for BACO
in/exit. And that needs cooperation from sound driver for dstate
change notification for function 1(audio). Otherwise(on sound driver
missing), BACO cannot be kicked in correctly and hang will be observed
on RUNPM exit.

By switching back to legacy message way on sound driver missing,
we are able to fix the runpm hang observed for the scenario below:
amdgpu driver loaded -> runpm suspend kicked -> sound driver loaded

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reported-and-tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 828db598 30-Jun-2021 Darren Powell <darren.powell@amd.com>

amdgpu/pm: Replace navi10 usage of sprintf with sysfs_emit

initial modification of files
smu_cmn.c
navi10_ppt.c

=== Test ===
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | awk '{print $9}'`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}
LOGFILE=pp_printf.test.log

lspci -nn | grep "VGA\|Display" > $LOGFILE
FILES="pp_dpm_sclk
pp_sclk_od
pp_mclk_od
pp_dpm_pcie
pp_od_clk_voltage
pp_power_profile_mode "

for f in $FILES
do
echo === $f === >> $LOGFILE
cat $HWMON_DIR/device/$f >> $LOGFILE
done
cat $LOGFILE

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 544dcd74 28-Jul-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amd/pm: Fix a bug in semaphore double-lock

Fix a bug in smu_cmn_send_msg_without_waiting() in
that this function does not need to take the
smu->message_lock mutex in order to send a message
down to the SMU. The mutex is acquired by the
caller of this function instead.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Changfeng Zhu <Changfeng.Zhu@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Fixes: 5810323ba69289 ("drm/amd/pm: Fix a bug communicating with the SMU (v5)")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5810323b 09-Jul-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amd/pm: Fix a bug communicating with the SMU (v5)

This fixes a bug which if we probe a non-existing
I2C device, and the SMU returns 0xFF, from then on
we can never communicate with the SMU, because the
code before this patch reads and interprets 0xFF
as a terminal error, and thus we never write 0
into register 90 to clear the status (and
subsequently send a new command to the SMU.)

It is not an error that the SMU returns status
0xFF. This means that the SMU executed the last
command successfully (execution status), but the
command result is an error of some sort (execution
result), depending on what the command was.

When doing a status check of the SMU, before we
send a new command, the only status which
precludes us from sending a new command is 0--the
SMU hasn't finished executing a previous command,
and 0xFC--the SMU is busy.

This bug was seen as the following line in the
kernel log,

amdgpu: Msg issuing pre-check failed(0xff) and SMU may be not in the right state!

when subsequent SMU commands, not necessarily
related to I2C, were sent to the SMU.

This patch fixes this bug.

v2: Add a comment to the description of
__smu_cmn_poll_stat() to explain why we're NOT
defining the SMU FW return codes as macros, but
are instead hard-coding them. Such a change, can
be followed up by a subsequent patch.

v3: The changes are,
a) Add comments to break labels in
__smu_cmn_reg2errno().

b) When an unknown/unspecified/undefined result is
returned back from the SMU, map that to
-EREMOTEIO, to distinguish failure at the SMU
FW.

c) Add kernel-doc to
smu_cmn_send_msg_without_waiting(),
smu_cmn_wait_for_response(),
smu_cmn_send_smc_msg_with_param().

d) In smu_cmn_send_smc_msg_with_param(), since we
wait for completion of the command, if the
result of the completion is
undefined/unknown/unspecified, we print that to
the kernel log.

v4: a) Add macros as requested, though redundant, to
be removed when SMU consolidates for all
ASICs--see comment in code.
b) Get out if the SMU code is unknown.

v5: Rename the macro names.

Cc: Alex Deucher <Alexander.Deucher@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Lijo Lazar <Lijo.Lazar@amd.com>
Fixes: fcb1fe9c9e0031 ("drm/amd/powerplay: pre-check the SMU state before issuing message")
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1e75be2b 07-Dec-2020 Evan Quan <evan.quan@amd.com>

drm/amd/pm: update the cached dpm feature status

For some ASICs, the real dpm feature disablement job is handled by
PMFW during baco reset and custom pptable loading. Cached dpm feature
status need to be updated to pair that.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c23083cd 07-Jun-2021 Graham Sider <Graham.Sider@amd.com>

drm/amd/pm: Add common throttler translation func

Defines smu_cmn_get_indep_throttler_status which performs ASIC
independent translation given a corresponding lookup table.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 22a7dcf5 03-Jun-2021 Graham Sider <Graham.Sider@amd.com>

drm/amd/pm: Add u64 throttler status field to gpu_metrics

This patch set adds support for a new ASIC independant u64 throttler
status field (indep_throttle_status). Piggybacks off the
gpu_metrics_v1_3 bump and similarly bumps gpu_metrics_v2 version (to
v2_2) to add field.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7afefb81 21-May-2021 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Rename flag which prevents HW access

Make it's name not feature but function descriptive.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521204122.762288-1-andrey.grodzovsky@amd.com


# 0b7db431 14-May-2021 David M Nieto <david.nieto@amd.com>

drm/amdgpu/pm: Update metrics table (v2)

v2: removed static dpm and frequency ranges from table

expand metrics table with voltages and frequency ranges

Signed-off-by: David M Nieto <david.nieto@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cfd053be 26-Apr-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: expose pmfw attached timestamp on Aldebaran

Available with 68.18.0 and later PMFWs.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1e4a53de 09-Apr-2021 Darren Powell <darren.powell@amd.com>

amdgpu/pm: add extra info to SMU msg pre-check failed message

Insert the value of the response to error message emitted when the
SMU msg pre-check failes

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 05516264 07-Apr-2021 charles sun <charles.sun@amd.com>

drm/amd/pm: increase time out value when sending msg to SMU

when do S3 stress, low rate that PowerUpVcn message will get response
more than 1s, so here increase the timeout to 2s

Signed-off-by: charles sun <charles.sun@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <Alexander.Deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e42569d0 16-Mar-2021 Lijo Lazar <lijo.lazar@amd.com>

drm/amd/pm: Modify mode2 msg sequence on aldebaran

v1: During mode2 reset, PCI space is lost after message is sent.
Restore PCI space before waiting for response from firmware.

v2: Move mode2 sequence to aldebaran and update PMFW version.
Handle generic sequence in smu13 without PMFW version check.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1689fca0 16-Mar-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: fix Navi1x runtime resume failure V2

The RLC was put into a wrong state on runtime suspend. Thus the RLC
autoload will fail on the succeeding runtime resume. By adding an
intermediate PPSMC_MSG_PrepareMp1ForUnload(some GC hard reset involved,
designed for PnP), we can bring RLC back into the desired state.

V2: integrate INTERRUPTS_ENABLED flag clearing into current
mp1 state set routines

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4215a119 25-Feb-2021 Horace Chen <horace.chen@amd.com>

drm/amdgpu: enable one vf mode on sienna cichlid vf

sienna cichlid needs one vf mode which allows vf to set and get
clock status from guest vm. So now expose the required interface
and allow some smu request on VF mode. Also since this asic blocked
direct MMIO access, use KIQ to send SMU request under sriov vf.

OD use same command as getting pp table which is not allowed for
sienna cichlid, so remove OD feature under sriov vf.

Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Monk Liu<monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 152bb95c 28-Feb-2021 Evan Quan <evan.quan@amd.com>

drm/amd/pm: update existing gpu_metrics interfaces V2

Update the gpu_metrics interface implementations to use the latest
upgraded data structures.

V2: fit the data type change of energy_accumulator

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8addf37c 18-Feb-2021 Nathan Chancellor <nathan@kernel.org>

drm/amd/pm/swsmu: Avoid using structure_size uninitialized in smu_cmn_init_soft_gpu_metrics

Clang warns:

drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:764:2: warning:
variable 'structure_size' is used uninitialized whenever switch default
is taken [-Wsometimes-uninitialized]
default:
^~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:770:23: note:
uninitialized use occurs here
memset(header, 0xFF, structure_size);
^~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:753:25: note:
initialize the variable 'structure_size' to silence this warning
uint16_t structure_size;
^
= 0
1 warning generated.

Return in the default case, as the size of the header will not be known.

Fixes: de4b7cd8cb87 ("drm/amd/pm/swsmu: unify the init soft gpu metrics function")
Link: https://github.com/ClangBuiltLinux/linux/issues/1304
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7b3d19a7 18-Feb-2021 Nathan Chancellor <nathan@kernel.org>

drm/amd/pm/swsmu: Avoid using structure_size uninitialized in smu_cmn_init_soft_gpu_metrics

Clang warns:

drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:764:2: warning:
variable 'structure_size' is used uninitialized whenever switch default
is taken [-Wsometimes-uninitialized]
default:
^~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:770:23: note:
uninitialized use occurs here
memset(header, 0xFF, structure_size);
^~~~~~~~~~~~~~
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu_cmn.c:753:25: note:
initialize the variable 'structure_size' to silence this warning
uint16_t structure_size;
^
= 0
1 warning generated.

Return in the default case, as the size of the header will not be known.

Fixes: de4b7cd8cb87 ("drm/amd/pm/swsmu: unify the init soft gpu metrics function")
Link: https://github.com/ClangBuiltLinux/linux/issues/1304
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# de4b7cd8 03-Feb-2021 Kevin Wang <kevin1.wang@amd.com>

drm/amd/pm/swsmu: unify the init soft gpu metrics function

the soft gpu metrics is not asic related data structure.
unify them to reduce duplicate code.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e484de44 20-Jan-2021 Huang Rui <ray.huang@amd.com>

drm/amd/pm: print the timeout of smc message

This patch is to help firmware designer to know the smc message timeout
status.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 96673790 12-Jan-2021 Huang Rui <ray.huang@amd.com>

drm/amd/pm: fix the return value of pm message

0 should be right driver return value, 0x1 is the right firmware
return value. So switch to 0 at last.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Xiaojian Du <xiaojian.du@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 74353883 02-Dec-2020 Huang Rui <ray.huang@amd.com>

drm/amdgpu: revise the mode2 reset for vangogh

PCIE MMIO bar needs to be restored firstly after the reset event
triggers. So it's unable to access the registers to wait for response
from SMU. Becasue the value of mmMP1_SMN_C2PMSG_90 is invalid at that
moment.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 47381540 07-Jan-2021 Huang Rui <ray.huang@amd.com>

drm/amd/pm: don't mark all apu as true on feature mask

VHG based APU will support feature mask checking.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 11db224b 08-Jan-2021 Huang Rui <ray.huang@amd.com>

drm/amd/pm: enhance the real response for smu message (v2)

The user prefers to know the real response value from C2PMSG 90 register
which is written by firmware not -EIO.

v2: return C2PMSG 90 value

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0339258b 26-Nov-2020 Evan Quan <evan.quan@amd.com>

drm/amd/pm: invalidate hdp before CPU access the memory written by GPU

To eliminate the possible influence by outdated HDP read cache.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ac70c6c6 26-Oct-2020 Xiaojian Du <Xiaojian.Du@amd.com>

drm/amd/pm: add one new function to get 32 bit feature mask for vangogh

This patch is to add one new function to get 32 bit feature mask for
vangogh.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a6c42e84 16-Oct-2020 Kevin Wang <kevin1.wang@amd.com>

drm/amd/swsmu: correct wrong feature bit mapping

1. when smc feature bit isn't mapped,
the feature state isn't showed on sysfs node of pp_features.
2. add pp_features table title

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7aeef2aa 16-Oct-2020 Kevin Wang <kevin1.wang@amd.com>

drm/amd/swsmu: correct wrong feature bit mapping

1. when smc feature bit isn't mapped,
the feature state isn't showed on sysfs node of pp_features.
2. add pp_features table title

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 10144762 31-Aug-2020 Evan Quan <evan.quan@amd.com>

drm/amd/pm: postpone SOCCLK/UCLK enablement after DAL initialization(V2)

This is needed for Navi1X only. And it may help for display missing
or hang issue seen on some high resolution monitors.

V2: no UCLK DPM enablement for Navi10 A0 secure SKU

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bf36b52e 29-Jul-2020 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Avoid accessing HW when suspending SW state

At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e098bc96 13-Aug-2020 Evan Quan <evan.quan@amd.com>

drm/amd/pm: optimize the power related source code layout

The target is to provide a clear entry point(for power routines).
Also this can help to maintain a clear view about the frameworks
used on different ASICs. Hopefully all these can make power part
more friendly to play with.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>