#
2c684b93 |
|
28-Feb-2024 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add deferred error check for UMC v12 address query Both RAS UE and deferred errors need page retirement. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
01087a19 |
|
26-Jan-2024 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: use PSP address query command Get UMC physical address from PSP in RAS error address conversion. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
0795b5d2 |
|
14-Jan-2024 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Support retiring multiple MCA error address pages Support retiring multiple MCA error address pages in one in-band query for umc v12_0. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
afb617f3 |
|
14-Jan-2024 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: add interface to check mca umc status Add interface to check mca umc status. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
22f6e3e1 |
|
14-Jan-2024 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Add log info for umc_v12_0 Add log info for umc_v12_0. v2: Delete redundant logs. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
a9e4f61d |
|
17-Jan-2024 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: update error condition check for umc_v12_0_query_error_address Deferred error is also taken into account. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
46e2231c |
|
03-Jan-2024 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Log deferred error separately Separate deferred error from UE and CE and log it individually. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
f38765de |
|
13-Nov-2023 |
Yang Wang <kevinyang.wang@amd.com> |
drm/amdgpu: add umc v12.0 ACA support add umc v12.0 ACA driver support Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
99cab331 |
|
12-Dec-2023 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Add umc page retirement for umc v12_0 Add umc page retirement for umc v12_0. V2: 1. Changed umc page retirement check condition to call umc_v12_0_is_uncorrectable_error. 2. Use memset to clear the contents of the umc error address structure. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
a8c77a12 |
|
13-Dec-2023 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: Add poison mode check error condition for umc v12_0 Add poison mode check error condition for umc v12_0. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
9f91e983 |
|
12-Dec-2023 |
YiPeng Chai <YiPeng.Chai@amd.com> |
drm/amdgpu: MCA supports recording umc address information MCA supports recording umc address information. V2: Move err_addr variable from struct ras_err_node to struct ras_err_info. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
bf13da6a |
|
24-Oct-2023 |
Yang Wang <kevinyang.wang@amd.com> |
drm/amdgpu: correct smu v13.0.6 umc ras error check correct smu v13.0.0 umc ras error check Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e020d015 |
|
27-Oct-2023 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Drop deferred error in uncorrectable error check Drop checking deferred error which can be handled by poison consumption. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
d59fcfb0 |
|
25-Oct-2023 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Identify data parity error corrected in replay mode Use ErrorCodeExt field to identify data parity error in replay mode. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
afcf949c |
|
18-Oct-2023 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Log UE corrected by replay as correctable error Support replay mode where UE could be converted to CE. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
3bba4bc6 |
|
26-Sep-2023 |
Yang Wang <kevinyang.wang@amd.com> |
drm/amdgpu: add RAS error info support for umc_v12_0 add RAS error info support for umc_v12_0. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
f8754f58 |
|
19-Sep-2023 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: print channel index for UMC bad page Print channel index for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
ced57520 |
|
06-Sep-2023 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: print more address info of UMC bad page Print out row, column and bank value of UMC error address for UMC v12. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
3cb9ebc9 |
|
26-Jul-2023 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add channel index table for UMC v12 Get UMC physical channel index according to node id, umc instance and channel instance. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
40a08fe8 |
|
17-May-2023 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdgpu: add address conversion for UMC v12 Convert MCA error address to physical address and find out all pages in one physical row. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
7e6ec099 |
|
10-May-2023 |
Candice Li <candice.li@amd.com> |
drm/amdgpu: Add umc v12_0 ras functions Add umc v12_0 ras error querying. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|