History log of /linux-master/drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c
Revision Date Author Comments
# b8f67b9d 18-Jan-2024 Shashank Sharma <shashank.sharma@amd.com>

drm/amdgpu: change vm->task_info handling

This patch changes the handling and lifecycle of vm->task_info object.
The major changes are:
- vm->task_info is a dynamically allocated ptr now, and its uasge is
reference counted.
- introducing two new helper funcs for task_info lifecycle management
- amdgpu_vm_get_task_info: reference counts up task_info before
returning this info
- amdgpu_vm_put_task_info: reference counts down task_info
- last put to task_info() frees task_info from the vm.

This patch also does logistical changes required for existing usage
of vm->task_info.

V2: Do not block all the prints when task_info not found (Felix)

V3: Fixed review comments from Felix
- Fix wrong indentation
- No debug message for -ENOMEM
- Add NULL check for task_info
- Do not duplicate the debug messages (ti vs no ti)
- Get first reference of task_info in vm_init(), put last
in vm_fini()

V4: Fixed review comments from Felix
- fix double reference increment in create_task_info
- change amdgpu_vm_get_task_info_pasid
- additional changes in amdgpu_gem.c while porting

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 46e5de77 03-Jan-2024 Yifan Zhang <yifan1.zhang@amd.com>

drm/amdgpu: add GFXHUB 11.5.1 support

This patch to add GFXHUB 11.5.1 support.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 31e0a586 03-Jan-2024 Yifan Zhang <yifan1.zhang@amd.com>

drm/amdgpu: add MMHUB 3.3.1 support

This patch to add MMHUB 3.3.1 support.

v2: squash in fault info fix (Alex)

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 26405ff4 13-Dec-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c

It's used for more than just SR-IOV now, so move it to
amdgpu_gmc.c and rename it to better match the functionality and
update the comments in the code paths to better document
when each path is used and why. No functional change.

Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun.Liu@amd.com
Cc: Christian.Koenig@amd.com


# a32c6f7f 15-Dec-2023 Stanley.Yang <Stanley.Yang@amd.com>

drm/amdgpu: Fix ecc irq enable/disable unpaired

The ecc_irq is disabled while GPU mode2 reset suspending process,
but not be enabled during GPU mode2 reset resume process.

Changed from V1:
only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip
delete amdgpu_ras_late_resume function

Changed from V2:
check umc ras supported before put ecc_irq

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0db062ea 09-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: disable AGP aperture

We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.

v2: leave it configurable via module parameter

Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6ba5b613 09-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add a module parameter to control the AGP aperture

Add a module parameter to control the AGP aperture. The AGP
aperture is an aperture in the GPU's internal address space
which provides direct non-paged access to the platform address
space. This access is non-snooped so only uncached memory
can be accessed.

Add a knob so that we can toggle this for debugging.

Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 564ca1b5 14-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: fix logic typo in AGP check

Should be && rather than ||.

Fixes: b2e1cbe6281f ("drm/amdgpu/gmc11: disable AGP on GC 11.5")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bc3c5660 08-Aug-2023 Victor Lu <victorchengchi.lu@amd.com>

drm/amdgpu: Add xcc param to SRIOV kiq write and WREG32_SOC15_IP_NO_KIQ (v4)

WREG32/RREG32_SOC15_IP_NO_KIQ and amdgpu_virt_kiq_reg_write_reg_wait
are not using the correct rlcg interface or mec engine, respectively.

Add xcc instance parameter to them.

v4: Use GET_INST and squash commit with:
"drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait"

v3: xcc not needed for MMMHUB

v2: rebase

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bff3315b 07-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix AGP init order

The default AGP settings were overwriting the IP selected
ones since the default was getting set after the IP ones
were selected.

Fixes: de59b69932e6 ("drm/amdgpu/gmc: set a default disable value for AGP")
Link: https://lists.freedesktop.org/archives/amd-gfx/2023-November/100966.html
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>


# 3ea8dd37 23-Oct-2023 Kenneth Feng <kenneth.feng@amd.com>

drm/amd/amdgpu: avoid to disable gfxhub interrupt when driver is unloaded

avoid to disable gfxhub interrupt when driver is unloaded on gmc 11

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6a1c31c7 12-Oct-2023 Yifan Zhang <yifan1.zhang@amd.com>

drm/amdgpu: flush the correct vmid tlb for specific pasid

flush the correct vmid tlb for specific pasid on gmc 11.

Fixes: 041a5743883d ("drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8dbf1ba8 29-Sep-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cache gpuvm fault information for gmc7+

Cache the current fault info in the vm struct. This can be queried
by userspace later to help debug UMDs.

Cc: samuel.pitoiset@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 67318cb8 14-Sep-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: set gart placement GC11

Needed to avoid a hardware issue.

v2: force high for all GC11 parts for consistency (Alex)
v3: rebase

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 917f91d8 14-Sep-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc: add a way to force a particular placement for GART

We normally place GART based on the location of VRAM and the
available address space around that, but provide an option
to force a particular location for hardware that needs it.

v2: Switch to passing the placement via parameter

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b2e1cbe6 21-Sep-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: disable AGP on GC 11.5

AGP aperture is deprecated and no longer functional.

v2: fix typo (Alex)
v3: just skip the agp setup call
v4: revert back to the original model
v5: back to v3

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# de59b699 20-Sep-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc: set a default disable value for AGP

To disable AGP, the start needs to be set to a higher
value than the end. Set a default disable value for
the AGP aperture and allow the IP specific GMC code
to enable it selectively be calling amdgpu_gmc_agp_location().

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3983c9fd 04-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: drop error return from flush_gpu_tlb_pasid

That function never fails, drop the error return.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 041a5743 04-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix and cleanup gmc_v11_0_flush_gpu_tlb_pasid

The same PASID can be used by more than one VMID, reset each of them.

Use the common KIQ handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a54db42f 01-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup gmc_v11_0_flush_gpu_tlb

Remove leftovers from copying this from the gmc v10 code.

v2: squash in fix from Yifan

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a70cb217 01-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: rework gmc_v10_0_flush_gpu_tlb v2

Move the SDMA workaround necessary for Navi 1x into a higher layer.

v2: use dev_err

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5f248462 21-Jul-2023 David Francis <David.Francis@amd.com>

drm/amdgpu: Add EXT_COHERENT memory allocation flags

These flags (for GEM and SVM allocations) allocate
memory that allows for system-scope atomic semantics.

On GFX943 these flags cause caches to be avoided on
non-local memory.

On all other ASICs they are identical in functionality to the
equivalent COHERENT flags.

Corresponding Thunk patch is at
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88

Reviewed-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4e8303cf 11-Sep-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Use function for IP version check

Use an inline function for version check. Gives more flexibility to
handle any format changes.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0bdf09cc 24-Aug-2023 Yifan Zhang <yifan1.zhang@amd.com>

drm/amdgpu: calling address translation functions to simplify codes

Use amdgpu_gmc_vram_pa to simplify codes.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 96271dd4 10-May-2023 benl <ben.li@amd.com>

drm/amdgpu: add gfxhub 11.5.0 support

Add initial gfxhub 11.5 support.

Signed-off-by: benl <ben.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# aba2be41 04-May-2023 Lang Yu <Lang.Yu@amd.com>

drm/amdgpu: add mmhub 3.3.0 support

Add initial implementation for mmhub 3.3.0.

v2: squash in client id fix (Alex)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# dd5a3261 14-Jul-2023 Prike Liang <Prike.Liang@amd.com>

drm/amdgpu/gmc11: initialize GMC for GC 11.5.0 memory support

Initialize vram attribute and VMHUB for GC 11.5.0.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6f38bdb8 26-Jul-2023 Lang Yu <Lang.Yu@amd.com>

drm/amdgpu: correct vmhub index in GMC v10/11

Align with new vmhub definition.

v2: use client_id == VMC to decide vmhub(Hawking)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 38d47145 30-Jun-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix warnings in gmc_v11_0.c

Fix below checkpatch warnings:

WARNING: quoted string split across lines
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'
WARNING: void function return statements are not generally useful
WARNING: braces {} are not necessary for any arm of this statement

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e2710187 04-Jul-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Prefer dev_warn over printk

Fix the below warning:

WARNING: Prefer [subsystem eg: netdev]_warn([subsystem]dev, ... then
dev_warn(dev, ... then pr_warn(... to printk(KERN_WARNING ...

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e77673d1 09-Jun-2023 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Update invalid PTE flag setting

Update the invalid PTE flag setting with TF enabled.
This is to ensure, in addition to transitioning the
retry fault to a no-retry fault, it also causes the
wavefront to enter the trap handler. With the current
setting, the fault only transitions to a no-retry fault.
Additionally, have 2 sets of invalid PTE settings, one for
TF enabled, the other for TF disabled. The setting with
TF disabled, doesn't work with TF enabled.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1bae03aa 30-May-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix up missing parameter in kdoc for 'inst' in gmc_ v7, v8, v9, v10, v11.c

Fix these warnings by adding 'inst' arguments to kdocs.

gcc with W=1
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:428: warning: Function parameter or member 'inst' not described in 'gmc_v7_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:626: warning: Function parameter or member 'inst' not described in 'gmc_v8_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:423: warning: Function parameter or member 'inst' not described in 'gmc_v10_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:328: warning: Function parameter or member 'inst' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:950: warning: Function parameter or member 'inst' not described in 'gmc_v9_0_flush_gpu_tlb_pasid'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f87f6864 09-May-2022 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Add XCC inst to PASID TLB flushing

Add XCC instance to select the correct KIQ ring when
flushing TLBs on a multi-XCC setup.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d9426c3d 20-Dec-2021 Le Ma <le.ma@amd.com>

drm/amdgpu: add bitmask to iterate vmhubs

As the layout of VMHUB definition has been changed to cover multiple
XCD/AID case, the original num_vmhubs is not appropriate to do vmhub
iteration any more.

Drop num_vmhubs and introduce vmhubs_mask instead.

v2: switch to the new VMHUB layout
v3: use DECLARE_BITMAP to define vmhubs_mask

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f4caf584 14-Sep-2022 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: introduce vmhub definition for multi-partition cases (v3)

v1: Each partition has its own gfxhub or mmhub. adjust
the num of MAX_VMHUBS and the GFXHUB/MMHUB layout (Le)

v2: re-design the AMDGPU_GFXHUB/AMDGPU_MMHUB layout (Le)

v3: apply the gfxhub/mmhub layout to new IPs (Hawking)

v4: fix up gmc11 (Alex)

v5: rebase (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 17d62410 11-May-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: implement get_vbios_fb_size()

Implement get_vbios_fb_size() so we can properly reserve
the vbios splash screen to avoid potential artifacts on the
screen during the transition from the pre-OS console to the
OS console.

Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7e5b6010 24-Apr-2023 Horatio Zhang <Hongkun.Zhang@amd.com>

drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini

The gmc.ecc_irq is enabled by firmware per IFWI setting,
and the host driver is not privileged to enable/disable
the interrupt. So, it is meaningless to use the amdgpu_irq_put
function in gmc_v11_0_hw_fini, which also leads to the call
trace.

[ 102.980303] Call Trace:
[ 102.980303] <TASK>
[ 102.980304] gmc_v11_0_hw_fini+0x54/0x90 [amdgpu]
[ 102.980357] gmc_v11_0_suspend+0xe/0x20 [amdgpu]
[ 102.980409] amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu]
[ 102.980459] amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
[ 102.980520] amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu]
[ 102.980573] amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu]
[ 102.980687] amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
[ 102.980740] process_one_work+0x21f/0x3f0
[ 102.980741] worker_thread+0x200/0x3e0
[ 102.980742] ? process_one_work+0x3f0/0x3f0
[ 102.980743] kthread+0xfd/0x130
[ 102.980743] ? kthread_complete_and_exit+0x20/0x20
[ 102.980744] ret_from_fork+0x22/0x30

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 68518294 11-May-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: implement get_vbios_fb_size()

Implement get_vbios_fb_size() so we can properly reserve
the vbios splash screen to avoid potential artifacts on the
screen during the transition from the pre-OS console to the
OS console.

Acked-by: Sunil Khatri <sunil.khatri@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x


# 13af5561 24-Apr-2023 Horatio Zhang <Hongkun.Zhang@amd.com>

drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini

The gmc.ecc_irq is enabled by firmware per IFWI setting,
and the host driver is not privileged to enable/disable
the interrupt. So, it is meaningless to use the amdgpu_irq_put
function in gmc_v11_0_hw_fini, which also leads to the call
trace.

[ 102.980303] Call Trace:
[ 102.980303] <TASK>
[ 102.980304] gmc_v11_0_hw_fini+0x54/0x90 [amdgpu]
[ 102.980357] gmc_v11_0_suspend+0xe/0x20 [amdgpu]
[ 102.980409] amdgpu_device_ip_suspend_phase2+0x240/0x460 [amdgpu]
[ 102.980459] amdgpu_device_ip_suspend+0x3d/0x80 [amdgpu]
[ 102.980520] amdgpu_device_pre_asic_reset+0xd9/0x490 [amdgpu]
[ 102.980573] amdgpu_device_gpu_recover.cold+0x548/0xce6 [amdgpu]
[ 102.980687] amdgpu_debugfs_reset_work+0x4c/0x70 [amdgpu]
[ 102.980740] process_one_work+0x21f/0x3f0
[ 102.980741] worker_thread+0x200/0x3e0
[ 102.980742] ? process_one_work+0x3f0/0x3f0
[ 102.980743] kthread+0xfd/0x130
[ 102.980743] ? kthread_complete_and_exit+0x20/0x20
[ 102.980744] ret_from_fork+0x22/0x30

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522
Fixes: c8b5a95b5709 ("drm/amdgpu: Fix desktop freezed after gpu-reset")
Cc: stable@vger.kernel.org


# 277bd337 23-May-2022 Le Ma <le.ma@amd.com>

drm/amdgpu: convert gfx.kiq to array type (v3)

v1: more kiq instances are a available in SOC (Le)
v2: squash commits to avoid breaking the build (Le)
v3: make the conversion for gfx/mec v11_0 (Hawking)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0530553b 19-May-2022 Le Ma <le.ma@amd.com>

drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)

It looks better to place this field in ring
structure. Also drop the repeated ring funcs definitions
if there's no difference except for vmhub field.

v2: rename the field to vm_hub like others (Le)
v3: apply the changes to new ip blocks (Hawking)
v4: fix vcn sw ring (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3e4bc662 17-Mar-2023 Lee Jones <lee@kernel.org>

drm/amd/amdgpu/gmc_v11_0: Provide a few missing param descriptions relating to hubs and flushes

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'vmhub' not described in 'gmc_v11_0_flush_gpu_tlb'
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:282: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb'
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'flush_type' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:322: warning: Function parameter or member 'all_hub' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 057e335c 05-Mar-2023 Yifan Zha <Yifan.Zha@amd.com>

drm/amdgpu: Init MMVM_CONTEXTS_DISABLE in gmc11 golden setting under SRIOV

[Why]
If disable the mmhub vm contexts(set MMVM_CONTEXTS_DISABLE to 0xffff),
driver loading failed on vf due to fence fallback timer expired on all rings.
FLR cannot reset MMVM_CONTEXTS_DISABLE.
So this vf can not be recovered anymore unless trigger a whole gpu reset.

[How]
Under SRIOV, init MMVM_CONTEXTS_DISABLE in gmc11 golden register setting.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Horace Chen <Horace.Chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a6dcf9a7 11-Mar-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Move umc ras block init to gmc ras sw_init

Initialize umc ras block only when umc ip block
supports ras. Driver queries ras capabilities after
early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2b595659 24-Feb-2023 Candice Li <candice.li@amd.com>

drm/amdgpu: Support umc node harvest config on umc v8_10

Don't need to query error count and error address on harvest umc nodes.
v2: Fix code bug, use active_mask instead of harvsest_config
and remove unnecessary argument in LOOP macro.
v3: Leave adev->gmc.num_umc unchanged.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 06630fb9 24-Feb-2023 Candice Li <candice.li@amd.com>

drm/amdgpu: Support umc node harvest config on umc v8_10

Don't need to query error count and error address on harvest umc nodes.
v2: Fix code bug, use active_mask instead of harvsest_config
and remove unnecessary argument in LOOP macro.
v3: Leave adev->gmc.num_umc unchanged.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e69c7857 16-Feb-2023 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: add umc retire unit element

It records how many bad pages are retired in one uncorrectable error.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c6eafee0 31-Jan-2023 Alex Deucher <alexander.deucher@amd.com>

Revert "Revert "drm/amdgpu/gmc11: enable AGP aperture""

This reverts commit 1a65327a84db5b9081a51ccb1c562083f59bfcec.

This should be resolved so we can re-enable this. Also,
the AGP apeture was bring programmed to 0 on MMHUB 3.0.1
since agp_start and end were not being set.

Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 735c7064 09-Nov-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: don't touch gfxhub registers during S0ix

gfxhub registers are part of gfx IP and should not need to be
changed. Doing so without disabling gfxoff can hang the gfx IP.

v2: add comments explaining why we can skip the interrupt
control for S0i3

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d0ca8248 11-Oct-2022 Yifan Zhang <yifan1.zhang@amd.com>

drm/amdgpu: add gmc v11 support for GC 11.0.4

Add gmc v11 support for GC 11.0.4.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b6da3c58 26-Sep-2022 YiPeng Chai <YiPeng.Chai@amd.com>

drm/amdgpu: Add umc channel index mapping table for umc_v8_10

Add umc channel index mapping table for umc_v8_10.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d1a372af 26-Aug-2022 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Set MTYPE in PTE based on BO flags

The same BO may need different MTYPEs and SNOOP flags in PTEs depending
on its current location relative to the mapping GPU. Setting MTYPEs from
clients ahead of time is not practical for coherent memory sharing.
Instead determine the correct MTYPE for the desired coherence model and
current BO location when updating the page tables.

To maintain backwards compatibility with MTYPE-selection in
AMDGPU_VA_OP_MAP, the coherence-model-based MTYPE selection is only
applied if it chooses an MTYPE other than MTYPE_NC (the default).

Add two AMDGPU_GEM_CREATE_... flags to indicate the coherence model. The
default if no flag is specified is non-coherent (i.e. coarse-grained
coherent at dispatch boundaries).

Update amdgpu_amdkfd_gpuvm.c to use this new method to choose the
correct MTYPE depending on the current memory location.

v2:
* check that bo is not NULL (e.g. PRT mappings)
* Fix missing ~ bitmask in gmc_v11_0.c
v3:
* squash in "drm/amdgpu: Inherit coherence flags on dmabuf import"

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7e2c5832 07-Sep-2022 Yifan Zha <Yifan.Zha@amd.com>

drm/amdgpu: Program GC registers through RLCG interface in gfx_v11/gmc_v11

[Why]
L1 blocks most of GC registers accessing by MMIO.

[How]
Use RLCG interface to program GC registers under SRIOV VF in full access time.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 97a3d609 07-Sep-2022 Yifan Zha <Yifan.Zha@amd.com>

drm/amdgpu: Program GC registers through RLCG interface in gfx_v11/gmc_v11

[Why]
L1 blocks most of GC registers accessing by MMIO.

[How]
Use RLCG interface to program GC registers under SRIOV VF in full access time.

Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 40ad3e54 27-Jul-2022 Yifan Zha <Yifan.Zha@amd.com>

drm/amdgpu: Skip the VRAM base offset on SRIOV

[Why]
As VF cannot read MMMC_VM_FB_OFFSET with L1 Policy(read 0xffffffff).
It leads to driver get the incorrect vram base offset.

[How]
Since SR-IOV is dGPU only, skip reading this register and set the
fb_offest to 0.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5818eae5 31-Jul-2022 Yifan Zha <Yifan.Zha@amd.com>

drm/amdgpu: skip "Issue additional private vm invalidation to MMHUB" on SRIOV

[Why]
vm_l2_bank_select_reserved_cid2 is a PF_only register
that cannot be programmed by VF. This feature is only
support HDP using GPUVM page tables to access FB memory
which should be disabled on SRIOV.

[How]
Disable the feature on VF.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Yifan Zha <Yifan.Zha@amd.com>
Signed-off-by: Horace Chen <horace.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fe09f343 27-Jun-2022 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: initialize gmc sw config for v11_0_3

initialize gmc sw config for v11_0_3

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Frank Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9436ac31 01-Jul-2022 Yang Wang <KevinYang.Wang@amd.com>

drm/amdgpu: add gfxhub_v3_0_3 support

add gfxhub_v3_0_3 support

Signed-off-by: Yang Wang <KevinYang.Wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4c5aa594 24-Jul-2022 Aaron Liu <aaron.liu@amd.com>

drm/amdgpu: enable swiotlb for gmc 11.0

Enable swiotlb for gmc 11.0.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e4b1edf4 04-Jul-2022 YiPeng Chai <YiPeng.Chai@amd.com>

drm/amdgpu: add umc ras functions for umc v8_10_0

1. Support query umc ras error counter.
2. Support ras umc ue error address remapping.

Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com>
Reviewed-by: Alexander Deucher <Alexander.Deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 395ece6f 27-Jun-2022 Jack Xiao <Jack.Xiao@amd.com>

Revert "drm/amdgpu/gmc11: avoid cpu accessing registers to flush VM"

This reverts commit 8748de873fedf4d55bdd99bbb738ee7ddf329792
since drv enabled mes to access registers.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cf606729 16-Jun-2022 Jack Xiao <Jack.Xiao@amd.com>

drm/amdgpu: enable mes to access registers v2

Enable mes to access registers.

v2: squash mes sched ring enablement flag

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8748de87 19-Apr-2022 Jack Xiao <Jack.Xiao@amd.com>

drm/amdgpu/gmc11: avoid cpu accessing registers to flush VM

Due to gfxoff on, cpu accessing registers is not expected.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b3e73cbf 18-May-2022 Graham Sider <Graham.Sider@amd.com>

drm/amdgpu: Remove break for VMID loop TLB flush on MES

Loop through all VMIDs for gmc_v10_0_flush_gpu_tlb_pasid and
gmc_v11_0_flush_gpu_tlb_pasid (only if using MES for gmc_v10). This is
required for MES due to use_different_vmid_compute causing SDMA queues
to be assigned different VMIDs than compute for the same PASID.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1a65327a 13-Jun-2022 Chengming Gui <Jack.Gui@amd.com>

Revert "drm/amdgpu/gmc11: enable AGP aperture"

This reverts commit 2cfe34e18970d26bff73c63f16c76dae22138d19.
Enable AGP aperture cause SDMA page fault for gfx11.0.2,
so temp disable AGP aperture until SDMA FW resolved this.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2cfe34e1 25-May-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc11: enable AGP aperture

Enable the AGP aperture on chips with GMC v11.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 10c4ad3a 12-May-2022 Huang Rui <ray.huang@amd.com>

drm/amdgpu: add mmhub v3_0_1 ip block

This adds mmhub v3_0_1 ip block support

v2: rebase (Alex)

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ae969b62 25-May-2022 Roman Li <Roman.Li@amd.com>

drm/amdgpu: fix aper_base for APU

[Why]
Wrong fb offset results in dmub f/w errors and white screen.
[drm:dc_dmub_srv_wait_idle [amdgpu]] *ERROR* Error waiting for DMUB idle: status=3

[How]
Read aper_base from mmhub because GC is off by default

v2: use BAR for passthrough (Alex)

Signed-off-by: Roman Li <Roman.Li@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b6c65a2c 06-May-2022 Christian König <christian.koenig@amd.com>

drm/amdgpu: add AMDGPU_VM_NOALLOC v2

Add the AMDGPU_VM_NOALLOC flag to let userspace control MALL allocation.

v2: also add the flag to the allowed flags.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1)
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3cc69021 21-Apr-2022 Graham Sider <Graham.Sider@amd.com>

drm/amdgpu: Implement get_vmid_pasid_mapping for gfx11

Implement gmc_v11_0_get_vmid_pasid_mapping_info to fix
gmc_v11_0_flush_gpu_tlb_pasid logic. Change from gfx10 to use
IH_VMID_*_LUT registers for VMID -> PASID mapping.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ee367aed 01-Aug-2021 Huang Rui <ray.huang@amd.com>

drm/amdgpu: add gmc v11 support for GC 11.0.1

Add gmc v11 support for GC 11.0.1.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f2754bf7 15-Apr-2022 Flora Cui <flora.cui@amd.com>

drm/amdgpu: add GMC11 support for GC 11.0.2

Add initial support for GC 11.0.2.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 89ae779b 15-Apr-2022 Flora Cui <flora.cui@amd.com>

drm/amdgpu: add UMC 8.11.0 support

Add initial support for UMC 8.11.0.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f40fc191 06-Feb-2022 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: split mmhub v3_0_2 callbacks from mmhub v3_0

So we don't need to add ip version check in every callback
when there is atc related programming that is only
available in mmhub v3_0

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 18ee4ce6 13-Apr-2022 Jack Xiao <Jack.Xiao@amd.com>

drm/amdgpu: add mes unmap legacy queue routine

For mes kiq has been taken over by mes sched, drv can't directly
use mes kiq to unmap queues. drv has to use mes sched api to
unmap legacy queue.

Signed-off-by: Jack Xiao <Jack.Xiao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1c2014da 06-Apr-2022 Tianci.Yin <tianci.yin@amd.com>

drm/amdgpu: add gmc v11_0 ip block (v3)

Add support for GPU memory controller v11.

v1: Add support for gmc v11.0
Add gmc 11 block (Tianci)
v2: drop unused amdgpu_bo_late_init (Hawking)
v3: squash in various fix

Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>