#
ed1e1e42 |
|
23-Jan-2024 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Support passing poison consumption ras block to SRIOV Support passing poison consumption ras blocks to SRIOV. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
afb617f3 |
|
14-Jan-2024 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: add interface to check mca umc status Add interface to check mca umc status. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
6c23f3d1 |
|
14-Jan-2024 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Use asynchronous polling to handle umc_v12_0 poisoning Use asynchronous polling to handle umc_v12_0 poisoning. v2: 1. Change function name. 2. Change the debugging information content. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
40a08fe8 |
|
17-May-2023 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add address conversion for UMC v12 Convert MCA error address to physical address and find out all pages in one physical row. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
b573cf88 |
|
30-May-2023 |
Stanley.Yang <Stanley.Yang@amd.com> |
drm/amdgpu: Support setting EEPROM table version Add setting EEPROM table version interface for umcv8.10, Add EEPROM table v2.1 to UMC v8.10. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e86bd8b2 |
|
27-Mar-2023 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: optimize redundant code in umc_v8_10 Optimize redundant code in umc_v8_10 Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
a6dcf9a7 |
|
11-Mar-2023 |
Hawking Zhang <Hawking.Zhang@amd.com> |
drm/amdgpu: Move umc ras block init to gmc ras sw_init Initialize umc ras block only when umc ip block supports ras. Driver queries ras capabilities after early_init, ras block init needs to be moved to sw_init. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
2b595659 |
|
24-Feb-2023 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Support umc node harvest config on umc v8_10 Don't need to query error count and error address on harvest umc nodes. v2: Fix code bug, use active_mask instead of harvest_config and remove unnecessary argument in LOOP macro. v3: Leave adev->gmc.num_umc unchanged. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
06630fb9 |
|
24-Feb-2023 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Support umc node harvest config on umc v8_10 Don't need to query error count and error address on harvest umc nodes. v2: Fix code bug, use active_mask instead of harvest_config and remove unnecessary argument in LOOP macro. v3: Leave adev->gmc.num_umc unchanged. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e69c7857 |
|
16-Feb-2023 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add umc retire unit element It records how many bad pages are retired in one uncorrectable error. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
1ed0e176 |
|
17-Oct-2022 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: remove ras_error_status parameter for UMC poison handler Make the code simpler. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
cbe4d43e |
|
17-Oct-2022 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add RAS page retirement functions for MCA Define page retirement functions for MCA platform. v2: remove page retirement handling from MCA poison handler, let MCA notifier do page retirement. v3: remove specific poison handler for MCA to simplify code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
6c0ca748 |
|
14-Oct-2022 |
Hawking Zhang <Hawking.Zhang@amd.com> |
drm/amdgpu: move convert_error_address out of umc_ras RAS error address translation algorithm is common across dGPU and A + A platform as long as the SOC integrates the same generation of UMC IP. UMC RAS is managed by x86 MCA on A + A platform, umc_ras in GPU driver is not initialized at all on A + A platform. In such case, any umc_ras callback implemented for dGPU config shouldn't be invoked from A + A specific callback. The change moves convert_error_address out of dGPU umc_ras structure and makes it shared between A + A and dGPU config. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Stanley Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
44420ac5 |
|
26-Sep-2022 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: define RAS convert_error_address API Make the code reusable and remove redundant code. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
c19a5f32 |
|
21-Sep-2022 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: export umc error address convert interface Make it global so we can convert specific mca address. v2: rename query_error_address_per_channel to convert_ras_error_address Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e4b1edf4 |
|
04-Jul-2022 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: add umc ras functions for umc v8_10_0 1. Support query umc ras error counter. 2. Support ras umc ue error address remapping. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
0dca257d |
|
14-Feb-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
01d468d9 |
|
16-Feb-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Modify .ras_fini function pointer parameter Modify .ras_fini function pointer parameter so that we can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
4e9b1fa5 |
|
13-Feb-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Modify .ras_late_init function pointer parameter Modify .ras_late_init function pointer parameter so that it can remove redundant intermediate calls in some ras blocks. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
a3ace75c |
|
07-Feb-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
400013b2 |
|
19-Jan-2022 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add umc_fill_error_record to make code more simple Create common amdgpu_umc_fill_error_record function for all versions of UMC and clean up related codes. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
efe17d5a |
|
05-Jan-2022 |
yipechai <YiPeng.Chai@amd.com> |
drm/amdgpu: Modify umc block to fit for the unified ras block data and ops 1. Modify umc block to fit for the unified ras block data and ops. 2. Change amdgpu_umc_ras_funcs to amdgpu_umc_ras, and remove the _funcs suffix from the corresponding variable name. 3. Remove the const flag of umc ras variable so that umc ras block can be inserted into amdgpu device ras block link list. 4. Invoke amdgpu_ras_register_ras_block function to register umc ras block into amdgpu device ras block link list. 5. Remove the redundant code about umc in amdgpu_ras.c after using the unified ras block. 6. Fill unified ras block .name .block .ras_late_init and .ras_fini for all of umc versions. If .ras_late_init and .ras_fini had been defined by the selected umc version, the defined functions will take effect; if not defined, default fill them with amdgpu_umc_ras_late_init and amdgpu_umc_ras_fini. Signed-off-by: yipechai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: John Clements <john.clements@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
fec8c524 |
|
20-Dec-2021 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: save error count in RAS poison handler Otherwise the RAS error count couldn't be queried from sysfs. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
f4409ee8 |
|
10-Dec-2021 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add gpu reset control for umc page retirement Add a reset parameter for umc page retirement, let user decide whether to call gpu reset in umc page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
8882f90a |
|
16-Nov-2021 |
Stanley.Yang <Stanley.Yang@amd.com> |
drm/amdgpu: add new query interface for umc block v2 add message smu to query error information v2: rename message_smu to ecc_info Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
aaca8c38 |
|
17-Sep-2021 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add poison mode query for UMC Add ras poison mode query interface for UMC. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
6ec598cc |
|
16-Jun-2021 |
Stanley.Yang <Stanley.Yang@amd.com> |
drm/amdgpu: fix bad address translation for sienna_cichlid Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
49070c4e |
|
17-Mar-2021 |
Hawking Zhang <Hawking.Zhang@amd.com> |
drm/amdgpu: split umc callbacks to ras and non-ras ones umc ras is not managed by gpu driver when gpu is connected to cpu through xgmi. split umc callbacks into ras and non-ras ones so gpu driver only initializes umc ras callbacks when it manages umc ras. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
c5a4ef3e |
|
23-Jul-2020 |
John Clements <john.clements@amd.com> |
drm/amdgpu: move umc specific macros to header certain umc macros are common across umc versions Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
bd68fb94 |
|
02-Jan-2020 |
John Clements <john.clements@amd.com> |
drm/amdgpu: resolve bug in UMC 6 error counter query iterate over all error counter registers in SMN space; removed support for error counter access via MMIO Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
181c93e5 |
|
18-Sep-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: move umc ras fini to umc block it's more suitable to put umc ras fini in umc block Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
56c54b25 |
|
12-Sep-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: remove ih_info parameter of umc_ras_late_init umc_ras_late_init can get the info by itself Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
03740baa |
|
12-Sep-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: move umc_ras_if from gmc to umc block umc_ras_if is relevant to umc Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
34cc4fd9 |
|
11-Sep-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: move umc ras irq functions to umc block move umc ras irq functions from gmc v9 to generic umc block, these functions are relevant to umc and they can be shared among all generations of umc Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e7da754b |
|
24-Sep-2019 |
Monk Liu <Monk.Liu@amd.com> |
drm/amdgpu: fix an UMC hw arbitrator bug(v3) issue: the UMC6 h/w bug is that when MCLK is doing the switch in the middle of a page access being preempted by high priority client (e.g. DISPLAY) then UMC and the mclk switch would get stuck there due to deadlock. how: fixed by disabling auto PreChg for UMC to avoid high priority client preempting other client's access on the same page, thus the deadlock could be avoided. v2: put the patch in callback of UMC6 v3: rename the callback to "init_registers" Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <hawking.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
d99659a0 |
|
06-Sep-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: rename umc ras_init to err_cnt_init this interface is related to specific version of umc, distinguish it from ras_late_init Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
86edcc7d |
|
05-Sep-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: move umc late init from gmc to umc block umc late init is umc specific, it's more suitable to be put in umc block Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
dd21a572 |
|
09-Aug-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: implement UMC 64 bits REG operations implement 64 bits operations via 32 bits interface v2: make use of lower_32_bits() and upper_32_bits() macros Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
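The 64-bit register handling described in dd21a572 can be illustrated with a minimal user-space sketch. It is not the driver's code: `lo32()` and `hi32()` stand in for the kernel's `lower_32_bits()` and `upper_32_bits()` macros, and the `demo_*` register file, with the low dword at `idx` and the high dword at `idx + 1`, is a hypothetical layout chosen only for illustration.

```c
#include <stdint.h>

/* User-space stand-ins for the kernel's lower_32_bits()/upper_32_bits(). */
static inline uint32_t lo32(uint64_t v) { return (uint32_t)(v & 0xffffffffu); }
static inline uint32_t hi32(uint64_t v) { return (uint32_t)(v >> 32); }

/* Hypothetical register file: a flat array of 32-bit registers. */
#define DEMO_NUM_REGS 8
static uint32_t demo_regs[DEMO_NUM_REGS];

/* The only access primitives assumed to exist are 32 bits wide. */
static void demo_write32(unsigned int idx, uint32_t v) { demo_regs[idx] = v; }
static uint32_t demo_read32(unsigned int idx) { return demo_regs[idx]; }

/* 64-bit write built from two 32-bit writes (assumed layout: low dword first). */
static void demo_write64(unsigned int idx, uint64_t v)
{
    demo_write32(idx, lo32(v));
    demo_write32(idx + 1, hi32(v));
}

/* 64-bit read built from two 32-bit reads, recombined in the same order. */
static uint64_t demo_read64(unsigned int idx)
{
    return (uint64_t)demo_read32(idx) |
           ((uint64_t)demo_read32(idx + 1) << 32);
}
```

Two separate 32-bit accesses are not atomic as a pair, so real hardware code would also have to consider whether the register can change between the two halves.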
#
fee858ba |
|
29-Jul-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add macro of umc for each channel common function for all umc versions, loop for each umc channel is a frequently used operation in umc block, define it as a macro to simplify code Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
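As a rough illustration of the per-channel loop macro introduced in fee858ba, the sketch below wraps a nested loop over UMC instances and channels in a single macro. The macro name, its parameters, and the helper function are hypothetical and are not taken from the driver.

```c
/* Hypothetical macro: visit every (umc instance, channel) pair in order. */
#define DEMO_FOR_EACH_UMC_CHANNEL(umc, ch, n_umc, n_ch) \
    for ((umc) = 0; (umc) < (n_umc); (umc)++)           \
        for ((ch) = 0; (ch) < (n_ch); (ch)++)

/* Example body: count how many channels the loop visits. */
static unsigned int demo_count_channels(unsigned int n_umc, unsigned int n_ch)
{
    unsigned int umc, ch, visited = 0;

    DEMO_FOR_EACH_UMC_CHANNEL(umc, ch, n_umc, n_ch)
        visited++;

    return visited;
}
```

Keeping the iteration in one macro means callers such as error-count or error-address queries only need to supply the loop body.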
#
33b97cf8 |
|
29-Jul-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add more parameters and functions to amdgpu_umc structure expose more parameters and functions of specific umc version to common umc layer, so amdgpu_umc layer and other blocks could access them Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
c2742aef |
|
22-Jul-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add structures for umc error address translation add related registers, callback function and channel index table Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
045c0216 |
|
22-Jul-2019 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: switch to amdgpu_umc structure create new amdgpu_umc structure for more umc settings in future and switch to the new structure Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <dennis.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
9e585a52 |
|
23-Jul-2019 |
Hawking Zhang <Hawking.Zhang@amd.com> |
drm/amdgpu: add amdgpu_umc_functions structure This is common structure as UMC callback function Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <dennis.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|