History log of /linux-master/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
Revision Date Author Comments
# b8f67b9d 18-Jan-2024 Shashank Sharma <shashank.sharma@amd.com>

drm/amdgpu: change vm->task_info handling

This patch changes the handling and lifecycle of vm->task_info object.
The major changes are:
- vm->task_info is a dynamically allocated ptr now, and its uasge is
reference counted.
- introducing two new helper funcs for task_info lifecycle management
- amdgpu_vm_get_task_info: reference counts up task_info before
returning this info
- amdgpu_vm_put_task_info: reference counts down task_info
- last put to task_info() frees task_info from the vm.

This patch also does logistical changes required for existing usage
of vm->task_info.

V2: Do not block all the prints when task_info not found (Felix)

V3: Fixed review comments from Felix
- Fix wrong indentation
- No debug message for -ENOMEM
- Add NULL check for task_info
- Do not duplicate the debug messages (ti vs no ti)
- Get first reference of task_info in vm_init(), put last
in vm_fini()

V4: Fixed review comments from Felix
- fix double reference increment in create_task_info
- change amdgpu_vm_get_task_info_pasid
- additional changes in amdgpu_gem.c while porting

Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Shashank Sharma <shashank.sharma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8f4de8f7 19-Dec-2023 Victor Lu <victorchengchi.lu@amd.com>

drm/amdgpu: Use correct SRIOV macro for gmc_v9_0_vm_fault_interrupt_state

Under SRIOV, programming to VM_CONTEXT*_CNTL regs failed because the
current macro does not pass through the correct xcc instance.
Use the *REG32_XCC macro in this case.

The behaviour without SRIOV is the same without this patch.

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Zhigang Luo <Zhigang.Luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3d1554d9 31-Jan-2024 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Avoid fetching VRAM vendor info

The present way to fetch VRAM vendor information turns out to be not
reliable on GFX 9.4.3 dGPUs as well. Avoid using the data.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fa8a91b0 29-Jan-2024 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()'

Return 0 for success scenairos in 'gmc_v6/7/8/9_0_hw_init()'

Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:920 gmc_v6_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1104 gmc_v7_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1224 gmc_v8_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2347 gmc_v9_0_hw_init() warn: missing error code? 'r'

Fixes: fac4ebd79fed ("drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e0eb08dc 20-Jan-2024 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Avoid fetching vram vendor information

For GFX 9.4.3 APUs, the current method of fetching vram vendor
information is not reliable. Avoid fetching the information.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 26405ff4 13-Dec-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move kiq_reg_write_reg_wait() out of amdgpu_virt.c

It's used for more than just SR-IOV now, so move it to
amdgpu_gmc.c and rename it to better match the functionality and
update the comments in the code paths to better document
when each path is used and why. No functional change.

Reviewed-by: Shaoyun.liu <Shaoyun.liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Shaoyun.Liu@amd.com
Cc: Christian.Koenig@amd.com


# 55173942 31-Jan-2024 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Avoid fetching VRAM vendor info

The present way to fetch VRAM vendor information turns out to be not
reliable on GFX 9.4.3 dGPUs as well. Avoid using the data.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 16da3990 29-Jan-2024 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()'

Return 0 for success scenairos in 'gmc_v6/7/8/9_0_hw_init()'

Fixes the below:
drivers/gpu/drm/amd/amdgpu/gmc_v6_0.c:920 gmc_v6_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:1104 gmc_v7_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:1224 gmc_v8_0_hw_init() warn: missing error code? 'r'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:2347 gmc_v9_0_hw_init() warn: missing error code? 'r'

Fixes: fac4ebd79fed ("drm/amdgpu: Fix with right return code '-EIO' in 'amdgpu_gmc_vram_checking()'")
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 90751bde 20-Jan-2024 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Avoid fetching vram vendor information

For GFX 9.4.3 APUs, the current method of fetching vram vendor
information is not reliable. Avoid fetching the information.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.7.x


# a32c6f7f 15-Dec-2023 Stanley.Yang <Stanley.Yang@amd.com>

drm/amdgpu: Fix ecc irq enable/disable unpaired

The ecc_irq is disabled while GPU mode2 reset suspending process,
but not be enabled during GPU mode2 reset resume process.

Changed from V1:
only do sdma/gfx ras_late_init in aldebaran_mode2_restore_ip
delete amdgpu_ras_late_resume function

Changed from V2:
check umc ras supported before put ecc_irq

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ed342a2e 01-Dec-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Use the right method to get IP version

Replace direct usage of adev->ip_versions with amdgpu_ip_version.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e8c2d3e2 09-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: disable AGP aperture

We've had misc reports of random IOMMU page faults when
this is used. It's just a rarely used optimization anyway, so
let's just disable it. It can still be toggled via the
module parameter for testing.

v2: leave it configurable via module parameter

Reviewed-by: Yang Wang <kevinyang.wang@amd.com> (v1)
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6ba5b613 09-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add a module parameter to control the AGP aperture

Add a module parameter to control the AGP aperture. The AGP
aperture is an aperture in the GPU's internal address space
which provides direct non-paged access to the platform address
space. This access is non-snooped so only uncached memory
can be accessed.

Add a knob so that we can toggle this for debugging.

Fixes: 67318cb84341 ("drm/amdgpu/gmc11: set gart placement GC11")
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Tested-by: Mario Limonciello <mario.limonciello@amd.com> # PHX & Navi33
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bdb72185 13-Nov-2023 Le Ma <le.ma@amd.com>

drm/amdgpu: finalizing mem_partitions at the end of GMC v9 sw_fini

The valid num_mem_partitions is required during ttm pool fini,
thus move the cleanup at the end of the function.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4abf0b0b 07-Nov-2023 David Yat Sin <David.YatSin@amd.com>

drm/amdgpu: Change extended-scope MTYPE on GC 9.4.3

Change local memory type to MTYPE_UC on revision id 0

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bc3c5660 08-Aug-2023 Victor Lu <victorchengchi.lu@amd.com>

drm/amdgpu: Add xcc param to SRIOV kiq write and WREG32_SOC15_IP_NO_KIQ (v4)

WREG32/RREG32_SOC15_IP_NO_KIQ and amdgpu_virt_kiq_reg_write_reg_wait
are not using the correct rlcg interface or mec engine, respectively.

Add xcc instance parameter to them.

v4: Use GET_INST and squash commit with:
"drm/amdgpu: Add xcc_inst param to amdgpu_virt_kiq_reg_write_reg_wait"

v3: xcc not needed for MMMHUB

v2: rebase

Signed-off-by: Victor Lu <victorchengchi.lu@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bff3315b 07-Nov-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix AGP init order

The default AGP settings were overwriting the IP selected
ones since the default was getting set after the IP ones
were selected.

Fixes: de59b69932e6 ("drm/amdgpu/gmc: set a default disable value for AGP")
Link: https://lists.freedesktop.org/archives/amd-gfx/2023-November/100966.html
Tested-by: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: Mikhail Gavrilov <mikhail.v.gavrilov@gmail.com>


# 142262a1 12-Oct-2023 David Francis <David.Francis@amd.com>

drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems

On gfx943 APU, EXT_COHERENT should give MTYPE_CC for local and
MTYPE_UC for nonlocal memory.

On NUMA systems, local memory gets the local mtype, set by an
override callback. If EXT_COHERENT is set, memory will be set as
MTYPE_UC by default, with local memory MTYPE_CC.

Add an option in the override function for this case, and
add a check to ensure it is not used on UNCACHED memory.

V2: Combined APU and NUMA code into one patch
V3: Fixed a potential nullptr in amdgpu_vm_bo_update

Signed-off-by: David Francis <David.Francis@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9ee81928 17-Oct-2023 Lin.Cao <lincao12@amd.com>

drm/amdgpu remove restriction of sriov max_pfn on Vega10

Remove restriction of sriov max_pfn so that TBA and TMA can move to high
47 bits address.

Regression test: change range alloc flag of libdrm as
AMDGPU_VA_RANGE_HIGH and there is no flr occur when testing amdgpu_test
of drm.

Signed-off-by: Lin.Cao <lincao12@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 21226f02 18-Oct-2023 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: replace reset_error_count with amdgpu_ras_reset_error_count

Simplify the code.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fdac8909 28-Sep-2023 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: ratelimited override pte flags messages

Use ratelimited version of dev_dbg to avoid flooding dmesg log. No
functional change.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8dbf1ba8 29-Sep-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: cache gpuvm fault information for gmc7+

Cache the current fault info in the vm struct. This can be queried
by userspace later to help debug UMDs.

Cc: samuel.pitoiset@gmail.com
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f4bff6e0 02-Oct-2023 Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

drm/amdgpu: Use ttm_pages_limit to override vram reporting

On GFXIP9.4.3 APU, allow the memory reporting as per the ttm pages
limit in NPS1 mode.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 917f91d8 14-Sep-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc: add a way to force a particular placement for GART

We normally place GART based on the location of VRAM and the
available address space around that, but provide an option
to force a particular location for hardware that needs it.

v2: Switch to passing the placement via parameter

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# de59b699 20-Sep-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc: set a default disable value for AGP

To disable AGP, the start needs to be set to a higher
value than the end. Set a default disable value for
the AGP aperture and allow the IP specific GMC code
to enable it selectively be calling amdgpu_gmc_agp_location().

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 08abccc9 04-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: further move TLB hw workarounds a layer up

For the PASID flushing we already handled that at a higher layer, apply
those workarounds to the standard flush as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e2e37888 04-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: rework lock handling for flush_tlb v2

Instead of each implementation doing this more or less correctly
move taking the reset lock at a higher level.

v2: fix typo

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3983c9fd 04-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: drop error return from flush_gpu_tlb_pasid

That function never fails, drop the error return.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e7b90e99 04-Sep-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb_pasid

Testing for reset is pointless since the reset can start right after the
test.

The same PASID can be used by more than one VMID, invalidate each of them.

Move the KIQ and all the workaround handling into common GMC code.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6205b558 19-Sep-2023 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: fix value of some UMC parameters for UMC v12

Prepare for bad page retirement.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 24a6eb92 31-Aug-2023 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix and cleanup gmc_v9_0_flush_gpu_tlb

The KIQ code path was ignoring the second flush. Also avoid long lines and
re-calculating the register offsets over and over again.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5f248462 21-Jul-2023 David Francis <David.Francis@amd.com>

drm/amdgpu: Add EXT_COHERENT memory allocation flags

These flags (for GEM and SVM allocations) allocate
memory that allows for system-scope atomic semantics.

On GFX943 these flags cause caches to be avoided on
non-local memory.

On all other ASICs they are identical in functionality to the
equivalent COHERENT flags.

Corresponding Thunk patch is at
https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88

Reviewed-by: David Yat Sin <David.YatSin@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4e8303cf 11-Sep-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Use function for IP version check

Use an inline function for version check. Gives more flexibility to
handle any format changes.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3cb9ebc9 26-Jul-2023 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: add channel index table for UMC v12

Get UMC phyical channel index according to node id, umc instance and
channel instance.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7e6ec099 10-May-2023 Candice Li <candice.li@amd.com>

drm/amdgpu: Add umc v12_0 ras functions

Add umc v12_0 ras error querying.

Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f54e1d47 18-Aug-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix kcalloc over kzalloc in 'gmc_v9_0_init_mem_ranges'

Replace kzalloc(n * sizeof(...), ...) with kcalloc(n, sizeof(...), ...)
since kcalloc is the preferred API in case of allocating with multiply.

Fixes the below:

WARNING: Prefer kcalloc over kzalloc with multiply

Cc: Guchun Chen <guchun.chen@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e20ff051 03-Aug-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Add memory vendor information

For ASICs with GC v9.4.3, determine the vendor information from scratch
register.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e2710187 04-Jul-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Prefer dev_warn over printk

Fix the below warning:

WARNING: Prefer [subsystem eg: netdev]_warn([subsystem]dev, ... then
dev_warn(dev, ... then pr_warn(... to printk(KERN_WARNING ...

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 08e85215 30-Jun-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix error & warnings in gmc_v9_0.c

Fix below checkpatch error & warnings:

ERROR: that open brace { should be on the previous line

WARNING: static const char * array should probably be static const char * const
WARNING: Block comments use * on subsequent lines
WARNING: Block comments use a trailing */ on a separate line

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e77673d1 09-Jun-2023 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Update invalid PTE flag setting

Update the invalid PTE flag setting with TF enabled.
This is to ensure, in addition to transitioning the
retry fault to a no-retry fault, it also causes the
wavefront to enter the trap handler. With the current
setting, the fault only transitions to a no-retry fault.
Additionally, have 2 sets of invalid PTE settings, one for
TF enabled, the other for TF disabled. The setting with
TF disabled, doesn't work with TF enabled.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d728eda3 03-May-2023 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: Enable translate further for GC v9.4.3

To extend UTCL2 reach.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1bae03aa 30-May-2023 Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>

drm/amdgpu: Fix up missing parameter in kdoc for 'inst' in gmc_ v7, v8, v9, v10, v11.c

Fix these warnings by adding 'inst' arguments to kdocs.

gcc with W=1
drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c:428: warning: Function parameter or member 'inst' not described in 'gmc_v7_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c:626: warning: Function parameter or member 'inst' not described in 'gmc_v8_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:423: warning: Function parameter or member 'inst' not described in 'gmc_v10_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v11_0.c:328: warning: Function parameter or member 'inst' not described in 'gmc_v11_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:950: warning: Function parameter or member 'inst' not described in 'gmc_v9_0_flush_gpu_tlb_pasid'

Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9535a86a 17-May-2023 Shiwu Zhang <shiwu.zhang@amd.com>

drm/amdgpu: bypass bios dependent operations

Since bios reading does not work currently so just bypass all operations
related to bios

v2: hardcode the vram info for APP_APU case (hawking)
v3: correct the vram_width with channel number * channel size (lijo)

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6dabce86 22-May-2023 Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>

drm/amdgpu: Fix unsigned comparison with zero in gmc_v9_0_process_interrupt()

Smatch warns:
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:579:
unsigned 'xcc_id' is never less than zero.

gfx_v9_4_3_ih_to_xcc_inst() returns negative numbers as well.
Fix this by changing type of xcc_id to int.

Fixes: 98b2e9cad227 ("drm/amdgpu: correct the vmhub index when page fault occurs")
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 45b3a914 16-May-2023 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: fix 64 bit division in partition code

Rework logic or use do_div() to avoid problems on 32 bit.

v2: add a missing case for XCP macro
v3: fix out of bounds array access
v4: fix xcp handling harder

Acked-by: Guchun Chen <guchun.chen@amd.com> (v1)
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> (v3)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3ebfd221 08-Mar-2023 Philip Yang <Philip.Yang@amd.com>

drm/amdkfd: Store xcp partition id to amdgpu bo

For memory accounting per compute partition and export drm amdgpu bo and
then import to KFD, we need the xcp id to account the memory usage or
find the KFD node of the original amdgpu bo to create the KFD bo on the
correct adev KFD node.

Set xcp_id_plus1 of amdgpu_bo_param to create bo and store xcp_id to
amddgpu bo. Add helper macro to get the mem_id from adev and xcp_id.

v2: squash in fix ("drm/amdgpu: Fix BO creation failure on GFX 9.4.3 dGPU")

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# dc12f9ed 02-Feb-2023 Philip Yang <Philip.Yang@amd.com>

drm/amdkfd: Update MTYPE for far memory partition

Use MTYPE RW/MTYPE_CC for mapping system memory or VRAM to KFD node
within the same memory partition, use MTYPE_NC for mapping on KFD node
from the far memory partition of the same socket or from another socket
on same XGMI hive.

On NPS4 or 4P system, MTYPE will be overridden per page depending on
the memory NUMA node id and vm->mem_id.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b9cbd510 06-Mar-2023 Graham Sider <Graham.Sider@amd.com>

drm/amdgpu/bu: update mtype_local parameter settings

Update mtype_local module parameter to use MTYPE_RW by default.

0: MTYPE_RW (default)
1: MTYPE_NC
2: MTYPE_CC

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 76eb9c95 27-Feb-2023 David Francis <David.Francis@amd.com>

drm/amdgpu/bu: add mtype_local as a module parameter

Selects the MTYPE to be used for local memory,
(0 = MTYPE_CC (default), 1 = MTYPE_NC, 2 = MTYPE_RW)

v2: squash in build fix (Alex)

Reviewed-by: Graham Sider <Graham.Sider@amd.com>
Signed-off-by: David Francis <David.Francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 352b919c 21-Feb-2023 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Override MTYPE per page on GFXv9.4.3 APUs

On GFXv9.4.3 NUMA APUs, system memory locality must be determined per
page to choose the correct MTYPE. This patch adds a GMC callback that
can provide this per-page override and implements it for native mode.

Carve-out mode is not yet supported and will use the safe default
(remote) MTYPE for system memory.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1e4a0033 21-Feb-2023 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Fix per-BO MTYPE selection for GFXv9.4.3

Treat system memory on NUMA systems as remote by default. Overriding with
a more efficient MTYPE per page will be implemented in the next patch.

No need for a special case for APP APUs. System memory is handled the same
for carve-out and native mode. And VRAM doesn't exist in native mode.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-and-tested-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 895797d9 06-Feb-2023 Graham Sider <Graham.Sider@amd.com>

drm/amdgpu/bu: Add use_mtype_cc_wa module param

By default, set use_mtype_cc_wa to 1 to set PTE coherence flag MTYPE_CC
instead of MTYPE_RW by default. This is required for the time being to
mitigate a bug causing XCCs to hit stale data due to TCC marking fully
dirty lines as exclusive.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2e8cc5d3 08-Feb-2023 Graham Sider <Graham.Sider@amd.com>

drm/amdgpu: Use legacy TLB flush for gfx943

Invalidate TLBs via a legacy flush request (flush_type=0) prior to the
heavyweight flush requests (flush_type=2) in gmc_v9_0.c. This is
temporarily required to mitigate a bug causing CPC UTCL1 to return stale
translations after invalidation requests in address range mode.

v2: squash in long term fix "drm/amdgpu: disable extra gfx943 legacy flush on rev1+"

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 46f7b4de 10-Apr-2023 Gavin Wan <Gavin.Wan@amd.com>

drm/amdgpu: Set memory partitions to 1 for SRIOV.

For SRIOV, the memory partitions are set on host drover. Each VF only
has one memory partition. We need set the memory partitions to 1 on
guest driver for SRIOV.

V2: sqaush in fix ("drm/amdgpu: Fix memory range info of GC 9.4.3 VFs")

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Acked-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b0a3bbf9 03-Apr-2023 Gavin Wan <Gavin.Wan@amd.com>

drm/amdgpu: Skip using MC FB Offset when APU flag is set for SRIOV.

The MC_VM_FB_OFFSET is PF only register. It cannot be read on VF.
So, the driver should not use MC_VM_FB_OFFSET address to set the
address of dev->gmc.aper_base.

Signed-off-by: Gavin Wan <Gavin.Wan@amd.com>
Reviewed-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a433f1f5 14-Feb-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Initialize memory ranges for GC 9.4.3

GC 9.4.3 ASICS may have memory split into multiple partitions.Initialize
the memory partition information for each range. The information may be
in the form of a numa node id or a range of pages.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0f2e1d62 16-Feb-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Get supported memory partition modes

Expand the interface to get supported memory partition modes also along
with the current memory partition mode.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b6f90baa 30-Jan-2023 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Move memory partition query to gmc

GMC block handles memory related information, it makes more sense to
keep memory partition functions in gmc block.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 228ce176 27-Jan-2023 Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

drm/amdgpu: Handle VRAM dependencies on GFXIP9.4.3

[For 1P NPS1 mode driver bringup]

Changes required to initialize the amdgpu driver with frontdoor firmware
loading and discovery=2 with the native mode SBIOS that enables CPU GPU
unified interleaved memory.

sudo modprobe amdgpu discovery=2

Once PSP TMR region is reported via the ACPI interface, the dependency
on the ip_discovery.bin will be removed.

Choice of where to allocate driver table is given to each IP version. In
general, both GTT and VRAM domains will be considered. If one of the
tables has a strict restriction for VRAM domain, then only VRAM domain
is considered.

Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
(lijo: Modified the handling for SMU Tables)
Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 73c2b3fd 22-Jan-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Initialize mmhub v1_8 ras function

Initialize mmhub v1_8 ras function.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d839a158 05-Jan-2023 Graham Sider <Graham.Sider@amd.com>

drm/amdgpu: Correct dGPU MTYPE settings for gfx943

Revert temporary dGPU VRAM MTYPE setting and align with expected
coherency protocol.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c9a502e9 28-Nov-2022 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Allocate GART table in RAM for AMD APU

Some AMD APUs may not have a dedicated VRAM. On such platforms the GART
table should be allocated on the system memory. When real vram size is
zero, place the GART table in system memory and create an SG BO to make
it GPU accessible.

v2: fix includes

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
(rajneesh: removed set_memory_wc workaround)
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 12c4d7ed 15-Dec-2022 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Fix GFX 9.4.3 dma address capability

ASICs with GFX 9.4.3 support 48-bit addressing.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a0a0c69c 13-Dec-2022 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Fix semaphore release

Use the right register for semaphore release during invalidation.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 98b2e9ca 09-Dec-2022 Le Ma <le.ma@amd.com>

drm/amdgpu: correct the vmhub index when page fault occurs

The AMDGPU_GFXHUB was bind to each xcc in the logical order.
Thus convert the node_id to logical xcc_id to index the
correct AMDGPU_GFXHUB. And "node_id / 4" can get the correct
AMDGPU_MMHUB0 index.

Signed-off-by: Le Ma <le.ma@amd.com>
Tested-by: Asad kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 753b999a 06-Dec-2022 Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

drm/amdgpu: set MTYPE in PTE for GFXIP 9.4.3

Apply the GFXIP 9.4.3 specific snoop and mtype settings for various
scenarios such as APU, APU in Carveout mode and dGPU mode.

Note: This is expected to change due to:
1 - NPS > 1 support in future
2 - Hardware bugs found during initial asic bringup.

Cc: Graham Sider <graham.sider@amd.com>
Cc: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7a1efad0 29-Nov-2022 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Use mask for active clusters

Use a mask of available active clusters instead of using only the number
of active clusters.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 497db7ea 08-Nov-2022 Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

drm/amdgpu: Check APU supports true APP mode

On GPXIP 9.4.3 APU, in no carveout mode there is no real vram heap and
could be emulated by the driver over the interleaved NUMA system memory
and the APU could also be in the carveout mode during early development
stage or otherwise for debugging purpose so introduce a new member in
amdgpu_gmc to figure out whether the APU is in the native mode as per
the production configuration. AMD_IS_APU cannot be used for Accelerated
Processing Platform APUs as it might be used in a different context on
previous generations or on small APUs.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Tested-by: Graham Sider <graham.sider@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# eaae4beee 14-Nov-2022 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: more GPU page fault info for GC v9.4.3

Output IH cookie node_id and translate it to the corresponding AID id
and XCC id, to help debug the GPU page fault.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f5fe7edf 30-Sep-2022 Mukul Joshi <mukul.joshi@amd.com>

drm/amdkfd: Update interrupt handling for GFX9.4.3

Update interrupt handling in CPX mode for GFX9.4.3 by using the
VMID space instead of SDMA client id to determine if an interrupt
should be processed by a KFD node. This is especially needed for
handling retry faults from MMHUB.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5fb34bd9 24-May-2022 Alex Sierra <alex.sierra@amd.com>

drm/amdkfd: pass kfd_node ref to svm migration api

This work is required for GC 9.4.3, previous to support memory
partitions per node at SVM. When multiple partition is configured,
every BO should be allocated inside one specific partition which
corresponds to the current amdgpu_device and kfd_node.

v2: squash in compilation fix (Alex)
v3: squash in fix for pre-gfx 9.4.3 (Alex)
v4: squash in best_loc fix (Alex)

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8078f1c6 29-Jun-2022 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Change num_xcd to xcc_mask

Instead of number of XCCs, keep a mask of XCCs for the exact XCCs
available on the ASIC. XCC configuration could differ based on
different ASIC configs.

v2:
Rename num_xcd to num_xcc (Hawking)
Use smaller xcc_mask size, changed to u16 (Le)

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9eb7681f 21-Feb-2022 Shiwu Zhang <shiwu.zhang@amd.com>

drm/amdgpu: add the support of XGMI link for GC 9.4.3

Add the xgmi LFB_CNTL/LBF_SIZE reg addresses to fetch the xgmi info from.

v2: move get_xgmi_info() to GC_V9_4_3 sepecific source files to utilize
the register definitions specific for GC_V9_4_3
v3: remove the duplicated register definitions
v4: enable xgmi based on asic_type as XGMI_IP ver is not available
yet for IP discovery

Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Ack-by: Lijo Lazar <Lijo.Lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5de6bd6a 25-Feb-2022 Le Ma <le.ma@amd.com>

drm/amdgpu: set mmhub bitmask for multiple AIDs

Like GFXHUB, set MMHUB0 bitmask for each AID.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 21e1217b 09-May-2022 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Fix VM fault reporting on XCC1

Fix VM fault reporting and clear VM fault register
for XCC1.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f87f6864 09-May-2022 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Add XCC inst to PASID TLB flushing

Add XCC instance to select the correct KIQ ring when
flushing TLBs on a multi-XCC setup.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ce8a12a5 20-Dec-2021 Le Ma <le.ma@amd.com>

drm/amdgpu: init vmhubs bitmask for GC 9.4.3

Each XCD owns one GFXHUB.

v2: switch to the new VMHUB layout

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d9426c3d 20-Dec-2021 Le Ma <le.ma@amd.com>

drm/amdgpu: add bitmask to iterate vmhubs

As the layout of VMHUB definition has been changed to cover multiple
XCD/AID case, the original num_vmhubs is not appropriate to do vmhub
iteration any more.

Drop num_vmhubs and introduce vmhubs_mask instead.

v2: switch to the new VMHUB layout
v3: use DECLARE_BITMAP to define vmhubs_mask

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f4caf584 14-Sep-2022 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: introduce vmhub definition for multi-partition cases (v3)

v1: Each partition has its own gfxhub or mmhub. adjust
the num of MAX_VMHUBS and the GFXHUB/MMHUB layout (Le)

v2: re-design the AMDGPU_GFXHUB/AMDGPU_MMHUB layout (Le)

v3: apply the gfxhub/mmhub layout to new IPs (Hawking)

v4: fix up gmc11 (Alex)

v5: rebase (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c0c27428 02-May-2023 Hamza Mahfooz <hamza.mahfooz@amd.com>

drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini()

As made mention of in commit c56edea58c31 ("drm/amdgpu: fix
amdgpu_irq_put call trace in gmc_v10_0_hw_fini") and commit aa6ac247ed7d
("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini"). It
is meaningless to call amdgpu_irq_put() for gmc.ecc_irq. So, remove it
from gmc_v9_0_hw_fini().

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522
Fixes: c8b5a95b5709 ("drm/amdgpu: Fix desktop freezed after gpu-reset")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 922a76ba 02-May-2023 Hamza Mahfooz <hamza.mahfooz@amd.com>

drm/amdgpu: fix an amdgpu_irq_put() issue in gmc_v9_0_hw_fini()

As made mention of in commit 08c677cb0b43 ("drm/amdgpu: fix
amdgpu_irq_put call trace in gmc_v10_0_hw_fini") and commit 13af556104fa
("drm/amdgpu: fix amdgpu_irq_put call trace in gmc_v11_0_hw_fini"). It
is meaningless to call amdgpu_irq_put() for gmc.ecc_irq. So, remove it
from gmc_v9_0_hw_fini().

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2522
Fixes: 3029c855d79f ("drm/amdgpu: Fix desktop freezed after gpu-reset")
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 277bd337 23-May-2022 Le Ma <le.ma@amd.com>

drm/amdgpu: convert gfx.kiq to array type (v3)

v1: more kiq instances are a available in SOC (Le)
v2: squash commits to avoid breaking the build (Le)
v3: make the conversion for gfx/mec v11_0 (Hawking)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0530553b 19-May-2022 Le Ma <le.ma@amd.com>

drm/amdgpu: move vmhub out of amdgpu_ring_funcs (v4)

It looks better to place this field in ring
structure. Also drop the repeated ring funcs definitions
if there's no difference except for vmhub field.

v2: rename the field to vm_hub like others (Le)
v3: apply the changes to new ip blocks (Hawking)
v4: fix vcn sw ring (Alex)

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 318e431b 11-Apr-2023 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Enable IH retry CAM on GFX9

This patch enables the IH retry CAM on GFX9 series cards. This
retry filter is used to prevent sending lots of retry interrupts
in a short span of time and overflowing the IH ring buffer. This
will also help reduce CPU interrupt workload.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 018f7300 14-Jul-2021 Le Ma <le.ma@amd.com>

drm/amdgpu: initialize gfxhub v1_2 and mmhub v1_8 funcs

Initialize gfxhub1.2 and mmhub1.8 function calls

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ab1a157e 25-Nov-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add gmc ip block support for GC 9.4.3

Initialize various gmc sw/hw settings/configurations
for GC 9.4.3.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# da9d669e 04-Mar-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Rework xgmi_wafl_pcs ras sw_init

To align with other IP blocks.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7f544c54 14-Mar-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Rework mca ras sw_init

To align with other IP blocks

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 474e2d49 04-Mar-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Move hdp ras block init to ras sw_init

Initialize hdp ras block only when mmhub ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fec70a86 04-Mar-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Move mmhub ras block init to ras sw_init

Initialize mmhub ras block only when mmhub ip block
supports ras features. Driver queries ras capabilities
after early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a6dcf9a7 11-Mar-2023 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: Move umc ras block init to gmc ras sw_init

Initialize umc ras block only when umc ip block
supports ras. Driver queries ras capabilities after
early_init, ras block init needs to be moved to
sw_init.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Stanley Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e69c7857 16-Feb-2023 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: add umc retire unit element

It records how many bad pages are retired in one uncorrectable error.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Stanley.Yang <Stanley.Yang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# da2f9920 04-Feb-2022 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup visible vram size handling

Centralize the limit handling and validation in one place instead
of spreading that around in different hw generations.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b93df61d 08-Nov-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: don't touch gfxhub registers during S0ix

gfxhub registers are part of gfx IP and should not need to be
changed. Doing so without disabling gfxoff can hang the gfx IP.

v2: add comments explaining why we can skip the interrupt
control for S0i3

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9c3db58b 07-Dec-2022 Christian König <christian.koenig@amd.com>

drm/amdgpu: fixx NULL pointer deref in gmc_v9_0_get_vm_pte

We not only need to make sure that we have a BO, but also that the BO
has some backing store.

Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags")
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d1a372af 26-Aug-2022 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Set MTYPE in PTE based on BO flags

The same BO may need different MTYPEs and SNOOP flags in PTEs depending
on its current location relative to the mapping GPU. Setting MTYPEs from
clients ahead of time is not practical for coherent memory sharing.
Instead determine the correct MTYPE for the desired coherence model and
current BO location when updating the page tables.

To maintain backwards compatibility with MTYPE-selection in
AMDGPU_VA_OP_MAP, the coherence-model-based MTYPE selection is only
applied if it chooses an MTYPE other than MTYPE_NC (the default).

Add two AMDGPU_GEM_CREATE_... flags to indicate the coherence model. The
default if no flag is specified is non-coherent (i.e. coarse-grained
coherent at dispatch boundaries).

Update amdgpu_amdkfd_gpuvm.c to use this new method to choose the
correct MTYPE depending on the current memory location.

v2:
* check that bo is not NULL (e.g. PRT mappings)
* Fix missing ~ bitmask in gmc_v11_0.c
v3:
* squash in "drm/amdgpu: Inherit coherence flags on dmabuf import"

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 876552e5 07-Sep-2022 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Update PTE flags with TF enabled

This patch updates the PTE flags when translate further (TF) is
enabled:
- With translate_further enabled, invalid PTEs can be 0. Reading
consecutive invalid PTEs as 0 is considered a fault. To prevent
this, ensure invalid PTEs have at least 1 bit set.
- The current invalid PTE flags settings to translate a retry fault
into a no-retry fault, doesn't work with TF enabled. As a result,
update invalid PTE flags settings which works for both TF enabled
and disabled case.

Fixes: 352e683b72e79d ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 37a0bad6 07-Sep-2022 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Update PTE flags with TF enabled

This patch updates the PTE flags when translate further (TF) is
enabled:
- With translate_further enabled, invalid PTEs can be 0. Reading
consecutive invalid PTEs as 0 is considered a fault. To prevent
this, ensure invalid PTEs have at least 1 bit set.
- The current invalid PTE flags settings to translate a retry fault
into a no-retry fault, doesn't work with TF enabled. As a result,
update invalid PTE flags settings which works for both TF enabled
and disabled case.

Fixes: 352e683b72e79d ("drm/amdgpu: Enable translate_further to extend UTCL2 reach")
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 373008bf 10-Aug-2022 Dusica Milinkovic <Dusica.Milinkovic@amd.com>

drm/amdgpu: Increase tlb flush timeout for sriov

[Why]
During multi-vf executing benchmark (Luxmark) observed kiq error timeout.
It happenes because all of VFs do the tlb invalidation at the same time.
Although each VF has the invalidate register set, from hardware side
the invalidate requests are queue to execute.

[How]
In case of 12 VF increase timeout on 12*100ms

Signed-off-by: Dusica Milinkovic <Dusica.Milinkovic@amd.com>
Acked-by: Shaoyun Liu <shaoyun.liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 352e683b 04-Aug-2022 Joseph Greathouse <Joseph.Greathouse@amd.com>

drm/amdgpu: Enable translate_further to extend UTCL2 reach

Enable translate_further on Arcturus and Aldebaran server chips
in order to increase the UTCL2 reach from 8 GiB to 64 GiB,
which is more in line with the amount of framebuffer DRAM in
the devices.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Kent Russell <kent.russell@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 25faeddc 25-Mar-2022 Evan Quan <evan.quan@amd.com>

drm/amdgpu: expand cg_flags from u32 to u64

With this, we can support more CG flags.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b818a5d3 09-Mar-2022 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc: use PCI BARs for APUs in passthrough

If the GPU is passed through to a guest VM, use the PCI
BAR for CPU FB access rather than the physical address of
carve out. The physical address is not valid in a guest.

v2: Fix HDP handing as suggested by Michel

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2d505453 15-Mar-2022 Guchun Chen <guchun.chen@amd.com>

drm/amdgpu: conduct a proper cleanup of PDB bo

Use amdgpu_bo_free_kernel instead of amdgpu_bo_unref to
perform a proper cleanup of PDB bo.

v2: update subject to be more accurate

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 24bf9fd1 20-Feb-2022 Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>

drm/amdgpu: Set correct DMA mask for aldebaran

Aldebaran has 48-bit physical address support

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 80e0c2cb 17-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Remove redundant .ras_fini initialization in some ras blocks

1. Define amdgpu_ras_block_late_fini_default in amdgpu_ras.c as
.ras_fini common function, which is called when
.ras_fini of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_fini to
initialize .ras_fini in ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0dca257d 14-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block

Remove redundant calls of amdgpu_ras_block_late_fini in umc ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9dad47c5 14-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block

Remove redundant calls of amdgpu_ras_block_late_fini in mmhub ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 418abce2 14-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Remove redundant .ras_late_init initialization in some ras blocks

1. Define amdgpu_ras_block_late_init_default in amdgpu_ras.c as
.ras_late_init common function, which is called when
.ras_late_init of ras block isn't initialized.
2. Remove the code of using amdgpu_ras_block_late_init to
initialize .ras_late_init in ras blocks.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 068001b71 13-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Remove redundant calls of ras_late_init in mmhub ras block

Remove redundant calls of ras_late_init in mmhub ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a3ace75c 07-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code

Optimize amdgpu_umc_ras_late_init/amdgpu_umc_ras_fini function code.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cb9561d0 07-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Optimize amdgpu_mmhub_ras_late_init/amdgpu_mmhub_ras_fini function code

Optimize amdgpu_mmhub_ras_late_init/amdgpu_mmhub_ras_fini function code.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 634b56b0 07-Feb-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Optimize amdgpu_hdp_ras_late_init/amdgpu_hdp_ras_fini function code

Optimize amdgpu_hdp_ras_late_init/amdgpu_hdp_ras_fini function code.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bdb3489c 30-Jan-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Optimize xxx_ras_late_init/xxx_ras_late_fini for each ras block

1. Define amdgpu_ras_block_late_init to create sysfs nodes
and interrupt handles.
2. Define amdgpu_ras_block_late_fini to remove sysfs nodes
and interrupt handles.
3. Replace ras block variable members in struct
amdgpu_ras_block_object with struct ras_common_if, which
can make it easy to associate each ras block instance
with each ras block functional interface.
4. Add .ras_cb to struct amdgpu_ras_block_object.
5. Change each ras block to fit for the changement of struct
amdgpu_ras_block_object.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d0fb18b5 19-Jan-2022 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Move reset sem into reset_domain

We want single instance of reset sem across all
reset clients because in case of XGMI we should stop
access cross device MMIO because any of them could be
in a reset in the moment.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://www.spinics.net/lists/amd-gfx/msg74117.html


# 498d46fe 19-Jan-2022 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: increase bad page number for umc ras query

One piece of umc normalizing address can be mapped to 16 pieces of
physical address in each umc channel on ALDEBARAN.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1f33bd18 18-Jan-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Move xgmi ras initialization from .late_init to .early_init

Move xgmi ras initialization from .late_init to .early_init, which let
xgmi ras can be initialized only once.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1b08dfb8 17-Jan-2022 Christian König <christian.koenig@amd.com>

drm/amdgpu: remove gart.ready flag

That's just a leftover from old radeon days and was preventing CS and GART
bindings before the hardware was initialized. But nowdays that is
perfectly valid.

The only thing we need to warn about are GART binding before the table
is even allocated.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 479e3b02 16-Jan-2022 Xiaojian Du <Xiaojian.Du@amd.com>

drm/amdgpu: add vram check function for GMC

This patch will add vram check function for GMC block.
It will write pattern data to the vram and then read back from the vram,
so that to verify the work status of vram.
This patch will cover gmc v6/7/8/9/10.

Signed-off-by: Xiaojian Du <Xiaojian.Du@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d622c094 13-Jan-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Fix the code style warnings in gmc

Fix the code style warnings in gmc:
ERROR: space required after that ',' (ctx:VxV).

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# efe17d5a 05-Jan-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Modify umc block to fit for the unified ras block data and ops

1.Modify umc block to fit for the unified ras block data and ops.
2.Change amdgpu_umc_ras_funcs to amdgpu_umc_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of umc ras variable so that umc ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register umc ras block into amdgpu device ras block link list.
5.Remove the redundant code about umc in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of umc versions. If .ras_late_init and .ras_fini had been defined by the selected umc version, the defined functions will take effect; if not defined, default fill them with amdgpu_umc_ras_late_init and amdgpu_umc_ras_fini.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5e67bba3 04-Jan-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Modify mmhub block to fit for the unified ras block data and ops

1.Modify mmhub block to fit for the unified ras block data and ops.
2.Change amdgpu_mmhub_ras_funcs to amdgpu_mmhub_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of mmhub ras variable so that mmhub ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register mmhub ras block into amdgpu device ras block link list. 5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block.
5.Remove the redundant code about mmhub in amdgpu_ras.c after using the unified ras block.
6.Fill unified ras block .name .block .ras_late_init and .ras_fini for all of mmhub versions. If .ras_late_init and .ras_fini had been defined by the selected mmhub version, the defined functions will take effect; if not defined, default fill them with amdgpu_mmhub_ras_late_init and amdgpu_mmhub_ras_fini.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6d76e904 04-Jan-2022 yipechai <YiPeng.Chai@amd.com>

drm/amdgpu: Modify hdp block to fit for the unified ras block data and ops

1.Modify hdp block to fit for the unified ras block data and ops.
2.Change amdgpu_hdp_ras_funcs to amdgpu_hdp_ras, and the corresponding variable name remove _funcs suffix.
3.Remove the const flag of hdp ras variable so that hdp ras block can be able to be inserted into amdgpu device ras block link list.
4.Invoke amdgpu_ras_register_ras_block function to register hdp ras block into amdgpu device ras block link list.
5.Remove the redundant code about hdp in amdgpu_ras.c after using the unified ras block.

Signed-off-by: yipechai <YiPeng.Chai@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# dc5d4aff 04-Jan-2022 Harry Wentland <harry.wentland@amd.com>

drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2

For some reason this file isn't using the appropriate register
headers for DCN headers, which means that on DCN2 we're getting
the VIEWPORT_DIMENSION offset wrong.

This means that we're not correctly carving out the framebuffer
memory correctly for a framebuffer allocated by EFI and
therefore see corruption when loading amdgpu before the display
driver takes over control of the framebuffer scanout.

Fix this by checking the DCE_HWIP and picking the correct offset
accordingly.

Long-term we should expose this info from DC as GMC shouldn't
need to know about DCN registers.

Cc: stable@vger.kernel.org
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 575e55ee 07-Jan-2022 Nirmoy Das <nirmoy.das@amd.com>

drm/amdgpu: recover gart table at resume

Get rid off pin/unpin of gart BO at resume/suspend and
instead pin only once and try to recover gart content
at resume time. This is much more stable in case there
is OOM situation at 2nd call to amdgpu_device_evict_resources()
while evicting GART table.

v3: remove gart recovery from other places
v2: pin gart at amdgpu_gart_table_vram_alloc()

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4a0165f0 15-Dec-2021 Victor Skvortsov <victor.skvortsov@amd.com>

drm/amdgpu: get xgmi info before ip_init

Driver needs to call get_xgmi_info() before ip_init
to determine whether it needs to handle a pending hive reset.

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Reviewed by: shaoyun.liu <Shaoyun.lui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 92f153bb 13-Dec-2021 Victor Skvortsov <victor.skvortsov@amd.com>

drm/amdgpu: Modify indirect register access for gmc_v9_0 sriov

Modify GC register access from MMIO to RLCG if the
indirect flag is set

v2: Replaced ternary operator with if-else for better
readability

Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com>
Reviewed-by: David Nieto <david.nieto@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 948e7ce0 13-Dec-2021 Jingwen Chen <Jingwen.Chen2@amd.com>

drm/amd/amdgpu: fix gmc bo pin count leak in SRIOV

[Why]
gmc bo will be pinned during loading amdgpu and reset in SRIOV while
only unpinned in unload amdgpu

[How]
add amdgpu_in_reset and sriov judgement to skip pin bo

v2: fix wrong judgement

Signed-off-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Reviewed-by: Horace Chen <horace.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 17252701 12-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amdgpu: correct the wrong cached state for GMC on PICASSO

Pair the operations did in GMC ->hw_init and ->hw_fini. That
can help to maintain correct cached state for GMC and avoid
unintention gate operation dropping due to wrong cached state.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 47d9c6fa 08-Dec-2021 chiminghao <chi.minghao@zte.com.cn>

drm:amdgpu:remove unneeded variable

return value form directly instead of
taking this in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cm>
Signed-off-by: chiminghao <chi.minghao@zte.com.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cace4bff 25-Nov-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: check df_funcs and its callback pointers

in case they are not avaiable in early phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3c2d6ea2 18-Nov-2021 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: handle IH ring1 overflow

IH ring1 is used to process GPU retry fault, overflow is enabled to
drain retry fault because we want receive other interrupts while
handling retry fault to recover range. There is no overflow flag set
when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow
and drain retry fault.

If fault timestamp goes backward, the fault is filtered and should not
be processed. Drain fault is finished if processed_timestamp is equal to
or larger than checkpoint timestamp.

Add amdgpu_ih_functions interface decode_iv_ts for different chips to
get timestamp from IV entry with different iv size and timestamp offset.
amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 17c65d6f 12-Dec-2021 Evan Quan <evan.quan@amd.com>

drm/amdgpu: correct the wrong cached state for GMC on PICASSO

Pair the operations did in GMC ->hw_init and ->hw_fini. That
can help to maintain correct cached state for GMC and avoid
unintention gate operation dropping due to wrong cached state.

BugLink: https://gitlab.freedesktop.org/drm/amd/-/issues/1828

Signed-off-by: Evan Quan <evan.quan@amd.com>
Acked-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 630e959f 01-Oct-2021 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: convert to IP version checking

Use IP versions rather than asic_type to differentiate
IP version specific features.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 71cf9e72 23-Sep-2021 Leslie Shi <Yuliang.Shi@amd.com>

drm/amdgpu: fix gart.bo pin_count leak

gmc_v{9,10}_0_gart_disable() isn't called matched with
correspoding gart_enbale function in SRIOV case. This will
lead to gart.bo pin_count leak on driver unload.

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 66805763 23-Sep-2021 Leslie Shi <Yuliang.Shi@amd.com>

drm/amdgpu: fix gart.bo pin_count leak

gmc_v{9,10}_0_gart_disable() isn't called matched with
correspoding gart_enbale function in SRIOV case. This will
lead to gart.bo pin_count leak on driver unload.

Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Leslie Shi <Yuliang.Shi@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ff891a2e 15-Aug-2021 Philip Yang <Philip.Yang@amd.com>

drm/amdkfd: check access permisson to restore retry fault

Check range access permission to restore GPU retry fault, if GPU retry
fault on address which belongs to VMA, and VMA has no read or write
permission requested by GPU, failed to restore the address. The vm fault
event will pass back to user space.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3907c492 23-Aug-2021 John Clements <john.clements@amd.com>

drm/amdgpu: Add driver infrastructure for MCA RAS

Add MCA specific IP blocks targetting RAS features

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 719e433e 23-Jul-2021 Mukul Joshi <mukul.joshi@amd.com>

drm/amdgpu: Fix channel_index table layout for Aldebaran

Fix the channel_index table layout to fetch the correct
channel_index when calculating physical address from
normalized address during page retirement.
Also, fix the number of UMC instances and number of channels
within each UMC instance for Aldebaran.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 308ef2ad 13-Jul-2021 John Clements <john.clements@amd.com>

drm/amdgpu: Resolve bug in UMC 6.7 error offset calculation

Use correct channel and instance values

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 186c8a85 09-Jun-2021 John Clements <john.clements@amd.com>

drm/amdgpu: initialize umc ras function

support umc ras function initialization for aldebaran

v2: squash in compile fix

Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# adbe2e3d 02-Jun-2021 Zhigang Luo <zhigang.luo@amd.com>

drm/amdgpu: remove sriov vf checking from getting fb location

host driver programmed fb location registers for vf, no need to
check anymore.

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-By : Shaoyun.liu <shaoyunl@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d10d0daa 17-May-2021 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Handle IOMMU enabled case.

Problem:
Handle all DMA IOMMU group related dependencies before the
group is removed. Those manifest themself in that when IOMMU
enabled DMA map/unmap is dependent on the presence of IOMMU
group the device belongs to but, this group is released once
the device is removed from PCI topology.

Fix:
Expedite all such unmap operations to pci remove driver callback.

v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate
v6: Drop the BO unamp list
v7:
Drop amdgpu_gart_fini
In amdgpu_ih_ring_fini do uncinditional check (!ih->ring)
to avoid freeing uniniitalized rings.
Call amdgpu_ih_ring_fini unconditionally.
v8: Add deatiled explanation

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210517143851.475058-1-andrey.grodzovsky@amd.com


# 8f6368a9 17-May-2021 John Clements <john.clements@amd.com>

drm/amdgpu: Conditionally reset RAS counters on boot

Only clear RAS error counters if perestent EDC harvesting is not supported

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8ab0d6f0 04-May-2021 Luben Tuikov <luben.tuikov@amd.com>

drm/amdgpu: Rename to ras_*_enabled

Rename,
ras_hw_supported --> ras_hw_enabled, and
ras_features --> ras_enabled,
to show that ras_enabled is a subset of
ras_hw_enabled, which itself is a subset
of the ASIC capability.

Cc: Alexander Deucher <Alexander.Deucher@amd.com>
Cc: John Clements <john.clements@amd.com>
Cc: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 78871b6c 28-Apr-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: enable ras error count query and reset for HDP

add hdp block ras error query and reset support in
amdgpu ras error count query and reset interface

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6f12507f 28-Apr-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: initialize hdp v4_0 ras functions

hdp v4_0 support ras features

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7845d80d 16-Apr-2021 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: remove dummy read workaround for newer chips

Aldebaran has a hw fix so no longer requires the workaround.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0ca565ab 01-Apr-2021 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Calling address translation functions to simplify codes

Use amdgpu_gmc_vram_pa and amdgpu_gmc_vram_cpu_pa
to simplify codes. No logic change.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d844c6d7 06-Apr-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: move mmhub ras_func init to ip specific file

mmhub ras is always owned by gpu driver. ras_funcs
initialization shall be done at ip level, instead of
putting it in common gmc interface file

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8bc7b360 19-Mar-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: split mmhub callbacks into ras and non-ras ones

mmhub ras is only avaiable in cerntain mmhub ip
generation.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 68d705dd 17-Mar-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: do not register df_mca interrupt in certain config

df/mca ras is not managed by gpu driver when gpu
is connected to cpu through xgmi. gpu driver should
register x86 mca notifier for umc ras error
notification

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 49070c4e 17-Mar-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: split umc callbacks to ras and non-ras ones

umc ras is not managed by gpu driver when gpu is
connected to cpu through xgmi. split umc callbacks
into ras and non-ras ones so gpu driver only
initializes umc ras callbacks when it manages
umc ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 52137ca8 18-Mar-2021 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: move xgmi ras functions to xgmi_ras_funcs

xgmi ras is not managed by gpu driver when gpu is
connected to cpu through xgmi. move all xgmi ras
functions to xgmi_ras_funcs so gpu driver only
initializes xgmi ras functions when it manages
xgmi ras.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 37c49ded 10-Mar-2021 Lijo Lazar <lijo.lazar@amd.com>

drm/amdgpu: Free PDB0 bo before bo_fini

Cleanup pdb0 bo before bo_fini gets called

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 58df0d71 09-Feb-2021 Sebastian Andrzej Siewior <bigeasy@linutronix.de>

drm/amdgpu: Replace in_interrupt() usage in gmc_v*_process_interrupt()

The usage of in_interrupt() in gmc_v*_process_interrupt() is intended to
use a different code path if invoked from the interrupt handler vs
invoked from the workqueue.

The usage of in_interrupt() in drivers is phased out and Linus clearly
requested that code which changes behaviour depending on context should
either be separated or the context be conveyed in an argument passed by the
caller, which usually knows the context.

gmc_v*_process_interrupt() is invoked via the ->process() callback
from amdgpu_ih_process() which in turn is invoked either from
amdgpu_irq_handler() (the interrupt handler) or from
amdgpu_irq_handle_*() which is a workqueue.

amdgpu_irq::ih is always processed from the interrupt handler, the other
three struct amdgpu_ih_ring members are processed from a workqueue.

Replace the in_interrupt() check with a comparison against adev->irq.ih.
A similar check is already done to check if the ih pointer is from
ih_soft.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6dce50b1 09-Feb-2021 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Let KFD use more VMIDs on Aldebaran

When there is no graphics support, KFD can use more of the VMIDs. Graphics
VMIDs are only used for video decoding/encoding and post processing. With
two VCE engines, there is no reason to reserve more than 2 VMIDs for that.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f4ec3e50 01-Feb-2021 Alex Sierra <alex.sierra@amd.com>

drm/amdgpu: update mmhub client ids for Aldebaran

update mmhub client id table for Aldebaran.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 522510a6 17-Sep-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Set up vmid0 PDB0

If use gart for FB translation, allocate and fill
PDB0.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7b454b3a 17-Sep-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Use different gart table parameters for 2-level gart table

If use gart for FB translation, we will squeeze vram into
sysvm aperture. This requires 2 level gart table. Add
page table depth and page table block size parameters
to gmc. This is prepare work to 2-level gart table
construction

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f527f310 15-Sep-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Placement of gart and vram in sysvm aperture

If use GART for FB translation, place both vram and gart to sysvm
aperture. AGP aperture is not set up in this case because it
is not used

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f1dc12ca 02-Oct-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Moved gart_size calculation to mc_init functions

In amdgpu_gmc_gart_location function, gart_size is adjusted
by a smu_prv_buffer_size. This logic shouldn't belong to
this function. Move the logic to the mc_init functions

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e844cd99 05-Jan-2021 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add mmhub client ids for aldebaran

Add the mmhub client id table for aldebaran.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b7daed1b 14-Dec-2020 Amber Lin <Amber.Lin@amd.com>

drm/amdgpu: Aldebaran doesn't use semaphore

Simplify all Aldebaran DIDs into one ASIC type.

Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# be566196 21-Nov-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Don't do FB resize under A+A config

Disable PCIe BAR resizing on A+A config. It's not needed because we won't use the
PCIe BAR, but it breaks the PCI BAR configuration with the current SBIOS.

Error message of FB BAR resize failure under A+A:

[ 154.913731] [drm:amdgpu_device_resize_fb_bar [amdgpu]] *ERROR* Problem resizing BAR0 (-22).

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Amber Lin <Amber.Lin@amd.com>
Reviewed-by: Felix Kuehling <Felix.kuehling@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Tested-by: Amber Lin <Amber.Lin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d477c5aa 23-Nov-2020 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: disallow use semaphore on aldebaran

shall revisit the change later

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3de60d96 17-Nov-2020 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: use physical_node_id to calculate aper_base

Similar as xgmi connected gpu nodes, physical_node_id
* segment_size should be used to calculate the offset
of aper_base.

The asic type check is redundant. once physical_node_id
and segment_size are initialized, it should be count
on.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4da999cd 11-Aug-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Clean up mmhub functions for aldebaran

Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC.

V2: Split patch into upstreamable and aldebaran

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 72b4db0f 05-May-2020 Eric Huang <jinhuieric.huang@amd.com>

drm/amdgpu: new cache coherence change for Aldebaran

To support new cache coherence HW on A+A platform mainly in KFD.

Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7ffe7238 26-May-2020 Yong Zhao <Yong.Zhao@amd.com>

drm/amdgpu: Fix an omission when adding Aldebaran support

Aldebaran should be the same as Arcturus in the PTE SNOOPED bit handling.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 31691b8d 21-Oct-2020 Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>

drm/amdgpu: define address map for host xgmi link (v3)

This applies to AMD Accelerated Processing Platforms that support host
gpu interconnect throguh a special link (xgmi). Aldebaran systems will
support this special feature for utilizing the benefits of host-gpu
cache coherence. This change outlines the basic framework for mapping
the GPU VRAM (HBM) to system address space making it accesible to the
host but managed by the amdgpu driver since this region is marked as
reserved memory in host address space by the underlying system firmware.

v2: switch to smuio callback function to check the type
of host-gpu interface (Hawking)
v3: use hub callbacks rather than direct function calls (Alex)

Reviewed-by: Oak Zeng <oak.zeng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# be14729a 21-Aug-2020 Yong Zhao <Yong.Zhao@amd.com>

drm/amdgpu: Print the IH client ID name when vm fault happens

This gives more information and improves productivity.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 85e39550 12-Nov-2019 Le Ma <le.ma@amd.com>

drm/amdgpu: add gmc v9 block support for Aldebaran

Add gfx memory controller support

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e83db774 02-Feb-2021 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: fix mmhub client mapping for arcturus

The hw interface changed on arcturus so the old numbering
scheme doesn't work.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9ca0674a 28-Dec-2020 Likun Gao <Likun.Gao@amd.com>

drm/amdgpu: remove redundant logic related HDP

Remove hdp_flush function from amdgpu_nbio struct as it have been unified
into hdp struct.
Remove the include about hdp register which was not used.
V2: Remove hdp golden setting which is unnecessary.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 455d40c9 28-Dec-2020 Likun Gao <Likun.Gao@amd.com>

drm/amdgpu: switch hdp callback functions for hdp v4

Switch to use the HDP functions which unified on hdp structure instead of
the scattered hdp callback functions.
V2: clean up hdp reset ras error count function.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d0f2f634 21-Nov-2020 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: remove unnecessary asic type check

The number of crtc should be 0 for ASICs that don't
have display engine. Remove the unnecessary asic type
check then.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 64f2c158 04-Dec-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: remove amdgpu_ttm_late_init and amdgpu_bo_late_init

No longer used.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bf0df09c 23-Nov-2020 Lee Jones <lee.jones@linaro.org>

drm/amd/amdgpu/gmc_v9_0: Suppy some missing function doc descriptions

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:382:23: warning: ‘ecc_umc_mcumc_status_addrs’ defined but not used [-Wunused-const-variable=]
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:720: warning: Function parameter or member 'vmhub' not described in 'gmc_v9_0_flush_gpu_tlb'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:836: warning: Function parameter or member 'flush_type' not described in 'gmc_v9_0_flush_gpu_tlb_pasid'
drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:836: warning: Function parameter or member 'all_hub' not described in 'gmc_v9_0_flush_gpu_tlb_pasid'

Acked-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fecf491a 23-Nov-2020 Lee Jones <lee.jones@linaro.org>

drm/amd/amdgpu/gmc_v9_0: Remove unused table 'ecc_umc_mcumc_status_addrs'

Fixes the following W=1 kernel build warning(s):

drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c:382:23: warning: ‘ecc_umc_mcumc_status_addrs’ defined but not used [-Wunused-const-variable=]

Acked-by: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0291150d 31-Oct-2020 Christian König <christian.koenig@amd.com>

drm/amdgpu: make sure retry faults are handled in a work item on Vega

Looks like we can't enabled the IH1/IH2 feature for Vega20, make sure
retry faults are handled on a separate ring anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 18e1a6c9 03-Nov-2020 Christian König <christian.koenig@amd.com>

drm/amdgpu: drop leading zeros from the gmc9 fault address

The address space is only 48bit, not 64bit. And the VMHUBs work with
sign extended addresses.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9304ca4d 19-Nov-2020 Gustavo A. R. Silva <gustavoars@kernel.org>

drm/amdgpu: Fix fall-through warnings for Clang

In preparation to enable -Wimplicit-fallthrough for Clang, fix multiple
warnings by explicitly adding multiple break statements instead of just
letting the code fall through to the next case.

Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e3898719 28-Oct-2020 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup gmc_v9_0_process_interrupt

First of all don't snprintf into a char buffer allocated on the stack with
a constant hubname.

Then cleanup to exit the function early in case of a ratelimit or SRIOV.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 21470d97 14-Oct-2020 Kevin Wang <kevin1.wang@amd.com>

drm/amdgpu: remove gfxhub_v1_1_funcs set

remove duplicate gfxhub v1.1 function set.
put function of gfxhub_v1_1_get_xgmi_info to gfxhub v1_0 function set.

Signed-off-by: Kevin Wang <kevin1.wang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4a20300b 28-Sep-2020 Guchun Chen <guchun.chen@amd.com>

drm/amdgpu: drop duplicated ecc check for vega10 (v5)

The same ECC check has been executed in amdgpu_ras_init for vega10,
prior to gmc_v9_0_late_init.

v2: drop all atombios helper callings
v3: use bit operation
v4: correct inline comment, remove parity check statement
v5: squash in build fix

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8ffff9b4 17-Sep-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: use function pointer for gfxhub functions

gfxhub functions are now called from function pointers,
instead of from asic-specific functions.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c24a3c05 21-Sep-2020 Liu Shixin <liushixin2@huawei.com>

drm/amdgpu/gmc9: simplify the return expression of gmc_v9_0_suspend

Simplify the return expression.

Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0eaa8012 13-Sep-2020 Shirish S <shirish.s@amd.com>

amdgpu/gmc_v9: Warn if SDPIF_MMIO_CNTRL_0 is not set

With IOMMU enabled, if SDPIF_MMIO_CNTRL_0 is not set
appropriately the system hangs without any trace
during S3.

To ease debug and to ensure that the failure, if any,
was caused by a race conditions that disabled write access to
SDPIF_MMIO_CNTRL_0 register, warn the user about it.

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f4075be8 14-Sep-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: remove mmhub client duplicated case

Copy paste typo.

Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 02f23f5f 02-Sep-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: print client id string for mmhub

Print the name of the client rather than the number. This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# be99ecbf 01-Sep-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: print client id string for gfxhub

Print the name of the client rather than the number. This
makes it easier to debug what block is causing the fault.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 81202807 31-Aug-2020 Dennis Li <Dennis.Li@amd.com>

drm/amdgpu: block ring buffer access during GPU recovery

When GPU is in reset, its status isn't stable and ring buffer also need
be reset when resuming. Therefore driver should protect GPU recovery
thread from ring buffer accessed by other threads. Otherwise GPU will
randomly hang during recovery.

v2: correct indent

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b0a2db9b 19-Aug-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add pre_asic_init callback for SOC15

We need to restore some registers prior to running asic
init to work around a firmware bug.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f8646661 18-Aug-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: fix up DCHUBBUB_SDPIF_MMIO_CNTRL_0 handling

Properly define this register using a relative offset rather
than an absolute offset and use the proper SOC15 macros to
access it. It's also DCN, not DCE, so remove it from the
DCE12 header.

No functional change.

Acked-by: Nirmoy Das <nirmoy.das@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# aac89168 19-Aug-2020 Dennis Li <Dennis.Li@amd.com>

drm/amdgpu: refine message print for devices of hive

Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device
of a hive is the message for.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 53b3f8f4 19-Aug-2020 Dennis Li <Dennis.Li@amd.com>

drm/amdgpu: refine codes to avoid reentering GPU recovery

if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive->in_reset
and adev->in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# df561f66 23-Aug-2020 Gustavo A. R. Silva <gustavoars@kernel.org>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>


# f1403342 12-Aug-2020 Christian König <christian.koenig@amd.com>

drm/amdgpu: revert "fix system hang issue during GPU reset"

The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9fb1506e 06-Aug-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Use function pointer for some mmhub functions

Add more function pointers to amdgpu_mmhub_funcs. ASIC specific
implementation of most mmhub functions are called from a general
function pointer, instead of calling different function for
different ASIC. Simplify the code by deleting duplicate functions

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7b885f0e 28-Jul-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: switch to using amdgpu_gmc_get_vbios_allocations

The new helper centralizes the logic in one place.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5db62dc8 28-Jul-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move keep stolen memory check into gmc core

Rather than leaving this as a gmc v9 specific hack.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fcbc92e2 28-Jul-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc

Since that is where we store the other data related to
the stolen vga memory.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 81b54fb7 28-Jul-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: use a define for the memory size of the vga emulator

Rather than open coding it everywhere.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# df9c8d1a 08-Jul-2020 Dennis Li <Dennis.Li@amd.com>

drm/amdgpu: fix system hang issue during GPU reset

when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev->in_gpu_reset and hive->in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev->reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm->is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev->reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249] dump_stack+0x98/0xd5
[ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833] free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831] ksys_ioctl+0x98/0xb0
[ 1230.204004] __x64_sys_ioctl+0x1a/0x20
[ 1230.205174] do_syscall_64+0x5f/0x250
[ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive->in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com>
Suggested-by: Luben Tukov <luben.tuikov@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 148f597d 01-Jul-2020 Huang Rui <ray.huang@amd.com>

drm/amdgpu: use register distance member instead of hardcode in GMC9

This patch updates to use register distance member instead of hardcode
in GMC9.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Tested-by: AnZhong Huang <anzhong.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 81659b20 24-Jun-2020 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Let KFD use more VMIDs on Arcturus

When there is no graphics support, KFD can use more of the VMIDs. Graphics
VMIDs are only used for video decoding/encoding and post processing. With
two VCE engines, there is no reason to reserve more than 2 VMIDs for that.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 40111ec2 24-Jun-2020 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Clean up KFD VMID assignment

The KFD VMID assignment was hard-coded in a few places. Consolidate that in
a single variable adev->vm_manager.first_kfd_vmid. The value is still
assigned in gmc-ip-version-specific code.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 54f78a76 15-May-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add apu flags (v2)

Add some APU flags to simplify handling of different APU
variants. It's easier to understand the special cases
if we use names flags rather than checking device ids and
silicon revisions.

v2: rebase on latest code

Acked-by: Evan Quan <evan.quan@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# abb17b1e 24-Apr-2020 Colin Ian King <colin.king@canonical.com>

drm/amdgpu/gmc: Use consistent variable on unlocks

Currently the error returns paths are unlocking lock kiq->ring_lock
however it seems this should be dev->gfx.kiq.ring_lock as this
is the lock that is being locked and unlocked around the ring
operations. This looks like a bug, but it's not. The kiq is just
a local variable pointing to the same structure. Make it consistent.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 04e4e2e9 22-Apr-2020 Yintian Tao <yttao@amd.com>

drm/amdgpu: protect ring overrun

Wait for the oldest sequence on the ring
to be signaled in order to make sure there
will be no command overrun.

v2: fix coding stype and remove abs operation
v3: remove the initialization of variable r

Signed-off-by: Yintian Tao <yttao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d2155a71 05-Apr-2020 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Print UTCL2 client ID on a gpuvm fault

UTCL2 client ID is useful information to get which
UTCL2 client caused the gpuvm fault. Print it out
for debug purpose

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 88474cca 10-Mar-2020 Guchun Chen <guchun.chen@amd.com>

drm/amdgpu: update ras capability's query based on mem ecc configuration

RAS support capability needs to be updated on top of different
memeory ECC enablement, and remove redundant memory ecc check
in gmc module for vega20 and arcturus.

v2: check HBM ECC enablement and set ras mask accordingly.
v3: avoid to invoke atomfirmware interface to query twice.

Suggested-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fe5211f1 01-Mar-2020 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add reset_ras_error_count function for MMHUB

MMHUB ras error counters are dirty ones after cold reboot
Read operation is needed to reset them to 0

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a3ed353c 27-Jan-2020 Shirish S <shirish.s@amd.com>

amdgpu/gmc_v9: save/restore sdpif regs during S3

fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# b80cd524 17-Jan-2020 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Improve Vega20 XGMI TLB flush workaround

Using a heavy-weight TLB flush once is not sufficient. Concurrent
memory accesses in the same TLB cache line can re-populate TLB entries
from stale texture cache (TC) entries while the heavy-weight TLB
flush is in progress. To fix this race condition, perform another TLB
flush after the heavy-weight one, when TC is known to be clean.

Move the workaround into the low-level TLB flushing functions. This way
they apply to amdgpu as well, and KIQ-based TLB flush only needs to
synchronize once.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Reviewed-by: shaoyun liu <shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c2ecd79b 27-Jan-2020 Shirish S <shirish.s@amd.com>

amdgpu/gmc_v9: save/restore sdpif regs during S3

fixes S3 issue with IOMMU + S/G enabled @ 64M VRAM.

Suggested-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fa34edbe 17-Jan-2020 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Use the correct flush_type in flush_gpu_tlb_pasid

The flush_type was incorrectly hard-coded to 0 when calling falling back
to MMIO-based invalidation in flush_gpu_tlb_pasid.

Fixes: ea930000a6dc ("drm/amdgpu: export function to flush TLB via pasid")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 37c58ddf 17-Jan-2020 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Fix TLB invalidation request when using semaphore

Use a more meaningful variable name for the invalidation request
that is distinct from the tmp variable that gets overwritten when
acquiring the invalidation semaphore.

Fixes: 4ed8a03740d0 ("drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 36a1707a 13-Jan-2020 Alex Sierra <alex.sierra@amd.com>

drm/amdgpu: modify packet size for pm4 flush tlbs

[Why]
PM4 packet size for flush message was oversized.

[How]
Packet size adjusted to allocate flush + fence packets.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ea930000 19-Dec-2019 Alex Sierra <alex.sierra@amd.com>

drm/amdgpu: export function to flush TLB via pasid

This can be used directly from amdgpu and amdkfd to invalidate
TLB through pasid.
It supports gmc v7, v8, v9 and v10.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bdf84a80 14-Jan-2020 Joseph Greathouse <Joseph.Greathouse@amd.com>

drm/amdgpu: Create generic DF struct in adev

The only data fabric information the adev struct currently
contains is a function pointer table. In the near future,
we will be adding some cached DF information into adev. As
such, this patch creates a new amdgpu_df struct for adev.
Right now, it only containst the old function pointer table,
but new stuff will be added soon.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bdbe90f0 06-Jan-2020 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc: move invaliation bitmap setup to common code

So it can be shared with newer GMC versions.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2ee9403e 10-Dec-2019 Zhigang Luo <zhigang.luo@amd.com>

drm/amd/amdgpu: L1 Policy(3/5) - removed ECC interrupt from VF

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 08546895 02-Dec-2019 Zhigang Luo <zhigang.luo@amd.com>

drm/amd/amdgpu: L1 Policy(2/5) - removed GC GRBM violations from gfxhub

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 20bf2f6f 14-Nov-2019 Zhigang Luo <zhigang.luo@amd.com>

drm/amd/amdgpu: L1 Policy(1/5) - removed VM settings for mmhub and gfxhub from VF

Signed-off-by: Zhigang Luo <zhigang.luo@amd.com>
Signed-off-by: Jane Jian <jane.jian@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1e2c6d55 20-Dec-2019 John Clements <john.clements@amd.com>

drm/amdgpu: Added ASIC specific check in gmc v9.0 ECC interrupt programming sequence

Devices newer then VEGA10/12 shall have these programming sequences performed by PSP BL

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 90f6452c 10-Dec-2019 changzhu <Changfeng.Zhu@amd.com>

drm/amdgpu: add invalidate semaphore limit for SRIOV and picasso in gmc9

It may fail to load guest driver in round 2 or cause Xstart problem
when using invalidate semaphore for SRIOV or picasso. So it needs avoid
using invalidate semaphore for SRIOV and picasso.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 413fc385 09-Dec-2019 changzhu <Changfeng.Zhu@amd.com>

drm/amdgpu: avoid using invalidate semaphore for picasso

It may cause timeout waiting for sem acquire in VM flush when using
invalidate semaphore for picasso. So it needs to avoid using invalidate
semaphore for piasso.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4cf781c2 10-Dec-2019 John Clements <john.clements@amd.com>

drm/amdgpu: Added RAS UMC error query support for Arcturus

Updated UMC 6.1 function set to support UMC 6.1.1 and 6.1.2 devices

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 418899d6 09-Dec-2019 changzhu <Changfeng.Zhu@amd.com>

drm/amdgpu: avoid using invalidate semaphore for picasso

It may cause timeout waiting for sem acquire in VM flush when using
invalidate semaphore for picasso. So it needs to avoid using invalidate
semaphore for piasso.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f920d1bb 18-Nov-2019 changzhu <Changfeng.Zhu@amd.com>

drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10

It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 4ed8a037 18-Nov-2019 changzhu <Changfeng.Zhu@amd.com>

drm/amdgpu: invalidate mmhub semaphore workaround in gmc9/gmc10

It may lose gpuvm invalidate acknowldege state across power-gating off
cycle. To avoid this issue in gmc9/gmc10 invalidation, add semaphore acquire
before invalidation and semaphore release after invalidation.

After adding semaphore acquire before invalidation, the semaphore
register become read-only if another process try to acquire semaphore.
Then it will not be able to release this semaphore. Then it may cause
deadlock problem. If this deadlock problem happens, it needs a semaphore
firmware fix.

Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f6c3623b 18-Nov-2019 Dennis Li <Dennis.Li@amd.com>

drm/amdgpu: implement querying ras error count for mmhub9.4

Get mmhub error counter by accessing EDC_CNT registers.

v2: Add mmhub_v9_4_ prefix for local static variable and function

Signed-off-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9e612c11 13-Nov-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: init umc functions for arcturus umc ras

reuse vg20 umc functions for arcturus umc ras

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f81b86a0 07-Oct-2019 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Enable gfx cache probing on HDP write for arcturus

This allows gfx cache to be probed and invalidated (for none-dirty cache lines)
on a HDP write (from either another GPU or CPU). This should work only for the
memory mapped as RW memory type newly added for arcturus, to achieve some cache
coherence b/t multiple memory clients.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cb1545f7 07-Oct-2019 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Clean up gmc_v9_0_gart_enable

Many logic in this function are HDP set up,
not gart set up. Moved those logic to gmc_v9_0_hw_init.
No functional change.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Acked-by: Christian konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ad02e08e 02-Oct-2019 Ori Messinger <Ori.Messinger@amd.com>

drm/amdgpu: Report vram vendor with sysfs (v3)

The vram vendor can be found as a separate sysfs file at:
/sys/class/drm/card[X]/device/mem_info_vram_vendor
The vram vendor is displayed as a string value.

v2: Use correct bit masking, and cache vram_vendor in gmc
v3: Drop unused functions for vram width, type, and vendor

Signed-off-by: Ori Messinger <ori.messinger@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 21889cec 26-Sep-2019 Jack Zhang <Jack.Zhang1@amd.com>

drm/amd/amdgpu/sriov ip block setting of Arcturus

Add ip block setting for Arcturus SRIOV

1.PSP need to be initialized before IH.
2.SMU doesn't need to be initialized at kmd driver.
3.Arcturus doesn't support DCE hardware,it needs to skip
register access to DCE.

Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ba083492 18-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: implement common gmc_ras_late_init

common gmc_ecc_late_init can be shared among all generations of gmc

v2: rename gmc_ecc_late_init to gmc_ras_late_init

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 56c54b25 12-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: remove ih_info parameter of umc_ras_late_init

umc_ras_late_init can get the info by itself

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2adf1344 12-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: add common gmc_ras_fini function

gmc_ras_fini can be shared among all generations of gmc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 65bc47a6 12-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: move mmhub_ras_if from gmc to mmhub block

mmhub_ras_if is relevant to mmhub

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d65bf1f8 12-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: replace mmhub_funcs with mmhub.funcs

remove mmhub_funcs in adev

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 03740baa 12-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: move umc_ras_if from gmc to umc block

umc_ras_if is relevant to umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 34cc4fd9 11-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: move umc ras irq functions to umc block

move umc ras irq functions from gmc v9 to generic umc block, these
functions are relevant to umc and they can be shared among all
generations of umc

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f5f06e21 11-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: update parameter of ras_ih_cb

change struct ras_err_data *err_data to void *err_data, align with
umc code and the callback's declaration in each ras block could
pay no attention to the structure type

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e7da754b 24-Sep-2019 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu: fix an UMC hw arbitrator bug(v3)

issue:
the UMC6 h/w bug is that when MCLK is doing the switch
in the middle of a page access being preempted by high
priority client (e.g. DISPLAY) then UMC and the mclk switch
would stuck there due to deadlock

how:
fixed by disabling auto PreChg for UMC to avoid high
priority client preempting other client's access on
the same page, thus the deadlock could be avoided

v2:
put the patch in callback of UMC6
v3:
rename the callback to "init_registers"

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <hawking.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 631cdbd2 23-Sep-2019 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/atomfirmware: simplify the interface to get vram info

fetch both the vram type and width in one function call. This
avoids having to parse the same data table twice to get the two
pieces of data.

Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ec671737 07-Dec-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add graceful VM fault handling v3

Next step towards HMM support. For now just silence the retry fault and
optionally redirect the request to the dummy page.

v2: make sure the VM is not destroyed while we handle the fault.
v3: fix VM destroy check, cleanup comments

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 029fbd43 09-Sep-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: initialize ras structures for xgmi block (v2)

init ras common interface and fs node for xgmi block

v2: remove unnecessary physical node number check before
invoking amdgpu_xgmi_ras_late_init

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 86edcc7d 05-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: move umc late init from gmc to umc block

umc late init is umc specific, it's more suitable to be put in umc block

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cbfae36c 02-Sep-2019 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup PTE flag generation v3

Move the ASIC specific code into a new callback function.

v2: mask the flags for SI and CIK instead of a BUG_ON().
v3: remove last missed BUG_ON().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 71776b6d 02-Sep-2019 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup mtype mapping

Unify how we map the UAPI flags to the PTE hardware flags for a mapping.

Only the MTYPE is actually ASIC dependent, all other flags should be
copied over 1 to 1 and ASIC differences are handled later on.

Signed-off-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 87d2b92f 15-Aug-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: save umc error records

save umc error records to ras bad page array

v2: add bad pages before gpu reset
v3: add NULL check for adev->umc.funcs

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c5b6e585 02-Sep-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: change r type to int in gmc_v9_0_late_init

change r type from bool to int, suitable for both bool and int return
value

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a85eff14 02-Sep-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu/gmc: switch to amdgpu_gmc_ras_late_init helper function

amdgpu_gmc_ras_late_init is used to init gmc specfic
ras debugfs/sysfs node and gmc specific interrupt handler.
It can be shared among gmc generations.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d094aea3 02-Sep-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: set ip specific ras interface pointer to NULL after free it

to prevent access to dangling pointers

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7c6e68c7 13-Sep-2019 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Avoid HW GPU reset for RAS.

Problem:
Under certain conditions, when some IP bocks take a RAS error,
we can get into a situation where a GPU reset is not possible
due to issues in RAS in SMU/PSP.

Temporary fix until proper solution in PSP/SMU is ready:
When uncorrectable error happens the DF will unconditionally
broadcast error event packets to all its clients/slave upon
receiving fatal error event and freeze all its outbound queues,
err_event_athub interrupt will be triggered.
In such case and we use this interrupt
to issue GPU reset. THe GPU reset code is modified for such case to avoid HW
reset, only stops schedulers, deatches all in progress and not yet scheduled
job's fences, set error code on them and signals.
Also reject any new incoming job submissions from user space.
All this is done to notify the applications of the problem.

v2:
Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev
Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c
Remove print param from amdgpu_ras_query_error_count

v3:
Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset
for other XGMI hive memebers.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8bf2485a 31-Aug-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: fix memory leak when ras is not supported on specific ip block

free ras_if if ras is not supported

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4ce71be6 30-Aug-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: check mmhub_funcs pointer before refering to it

mmhub callback functions are not initialized for all the ASICs

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 093e48c0 26-Jul-2019 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Support new arcturus mtype

Arcturus repurposed mtype WC to RW. Modify gmc functions
to support the new mtype

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# dda79907 29-Aug-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add mmhub ras_late_init callback function (v2)

The function will be called in late init phase to do mmhub
ras init

v2: check ras_late_init function pointer before invoking the
function

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2452e778 29-Aug-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: switch to amdgpu_ras_late_init for gmc v9 block (v2)

call helper function in late init phase to handle ras init
for gmc ip block

v2: call ras_late_fini to do clean up when fail to enable interrupt

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bebc0762 23-Aug-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: switch to new amdgpu_nbio structure

no functional change, just switch to new structures

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 994dcfaa 28-Aug-2019 Tianci.Yin <tianci.yin@amd.com>

drm/amdgpu: keep the stolen memory in visible vram region

stolen memory should be fixed in visible region.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Tianci.Yin <tianci.yin@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 53499173 16-Aug-2019 Xiaojie Yuan <xiaojie.yuan@amd.com>

drm/amdgpu: add dummy read for some GCVM status registers

The GRBM register interface is now capable of bursting 1 cycle per
register wr->wr, wr->rd much faster than previous muticycle per
transaction done interface. This has caused a problem where status
registers requiring HW to update have a 1 cycle delay, due to the
register update having to go through GRBM.

SW may operate on an incorrect value if they write a register and
immediately check the corresponding status register.

Registers requiring HW to clear or set fields may be delayed by 1 cycle.
For example,

1. write VM_INVALIDATE_ENG0_REQ mask = 5a
2. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a
a. HW will reset VM_INVALIDATE_ENG0_ACK = 0 until invalidation is complete
3. write VM_INVALIDATE_ENG0_REQ mask = 5a
4. read VM_INVALIDATE_ENG0_ACK till the ack is same as the request mask = 5a
a. First read of VM_INVALIDATE_ENG0_ACK = 5a instead of 0
b. Second read of VM_INVALIDATE_ENG0_ACK = 0 because
the remote GRBM h/w register takes one extra cycle to be cleared
c. In this case, SW will see a false ACK if they exit on first read

Affected registers (only GC variant) | Recommended Dummy Read
--------------------------------------+----------------------------
VM_INVALIDATE_ENG*_ACK | VM_INVALIDATE_ENG*_REQ
VM_L2_STATUS | VM_L2_STATUS
VM_L2_PROTECTION_FAULT_STATUS | VM_L2_PROTECTION_FAULT_STATUS
VM_L2_PROTECTION_FAULT_ADDR_HI/LO32 | VM_L2_PROTECTION_FAULT_ADDR_HI/LO32
VM_L2_IH_LOG_BUSY | VM_L2_IH_LOG_BUSY
MC_VM_L2_PERFCOUNTER_HI/LO | MC_VM_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER_HI/LO | ATC_L2_PERFCOUNTER_HI/LO
ATC_L2_PERFCOUNTER2_HI/LO | ATC_L2_PERFCOUNTER2_HI/LO

Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com>
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9d4f837a 20-Aug-2019 Frank.Min <Frank.Min@amd.com>

drm/amdgpu: unity mc base address for arcturus

arcturus for sriov would use the unified mc base address

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 81c274c4 21-Aug-2019 Frank.Min <Frank.Min@amd.com>

drm/amdgpu: disable agp for sriov

Since agp is not used for sriov, just disable it

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Frank.Min <Frank.Min@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4e0ae5e2 13-Aug-2019 Yong Zhao <Yong.Zhao@amd.com>

drm/amdgpu: Add printing for RW extracted from VM_L2_PROTECTION_FAULT_STATUS

RW is also useful in most cases.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3ff98548 01-Aug-2019 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Export function to flush TLB of specific vm hub

This is for kfd to reuse amdgpu TLB invalidation function.
On gfx10, kfd only needs to flush TLB on gfx hub but not
on mm hub. So export a function for KFD flush TLB only on
specific hub.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 244511f3 15-Aug-2019 Christoph Hellwig <hch@lst.de>

drm/amdgpu: simplify and cleanup setting the dma mask

Use dma_set_mask_and_coherent to set both masks in one go, and remove
the no longer required fallback, as the kernel now always accepts
larger than required DMA masks. Fail the driver probe if we can't
set the DMA mask, as that means the system can only support a larger
mask.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8787ee01 24-Jul-2019 Huang Rui <ray.huang@amd.com>

drm/amdgpu: add gmc v9 supports for renoir

Add gfx memory controller support for renoir.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cb15e804 09-Aug-2019 Le Ma <le.ma@amd.com>

drm/amdgpu: add mmhub clock gating for Arcturus

Add 2 mmhub instances CG

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bee7b51a 08-Aug-2019 Le Ma <le.ma@amd.com>

drm/amdgpu: split athub clock gating from mmhub

Untie the bind of get/set athub CG state from mmhub, for cosmetic fix and Asic
not using mmhub 1.0. Besides, also fix wrong athub CG state in amdgpu_pm_info.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 145b03eb 06-Aug-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: create mmhub ras framework

enable mmhub ras feature and create sysfs/debugfs node for mmhub

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3d093da0 06-Aug-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: add amdgpu_mmhub_funcs definition

add amdgpu_mmhub_funcs definition and initialize it,
prepare for mmhub ras enablement

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bd2280da 01-Aug-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: replace AMDGPU_RAS_UE with AMDGPU_RAS_SUCCESS

ce can also trigger interrupt, and even both ce and ue error can be
found in one ras query, distinguishing between ce and ue in interrupt
handler is uncessary.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Suggested-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 91ba68f8 31-Jul-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: only uncorrectable error needs gpu reset

we only read error information for correctable error in interrupt
handler, gpu reset is unnecessary since there is no data lost
in correctable error

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 13b7c46c 31-Jul-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: add error address query for umc ras

umc error address query can get ce/ue error address and clear error
status

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3aacf4ea 29-Jul-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: initialize new parameters and functions for amdgpu_umc structure

add initialization for new members of amdgpu_umc structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4cd4c5c0 30-Jul-2019 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu: cleanup vega10 SRIOV code path

we can simplify all those unnecessary function under
SRIOV for vega10 since:
1) PSP L1 policy is by force enabled in SRIOV
2) original logic always set all flags which make itself
a dummy step

besides,
1) the ih_doorbell_range set should also be skipped
for VEGA10 SRIOV.
2) the gfx_common registers should also be skipped
for VEGA10 SRIOV.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 81e02619 22-Jul-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: update interrupt callback for all ras clients

add err_data parameter in interrupt cb for ras clients

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 045c0216 22-Jul-2019 Tao Zhou <tao.zhou1@amd.com>

drm/amdgpu: switch to amdgpu_umc structure

create new amdgpu_umc structure to for more umc
settings in future and switch to the new structure

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 939e2258 17-Jul-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: querry umc error count

check umc error count in both ras querry function and
ras interrupt handler

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5b6b35aa 17-Jul-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: init umc v6_1 functions for vega20

init umc callback function for vega20 in sw early init phase

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Dennis Li <dennis.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5ddd4a9a 30-Jun-2019 Yong Zhao <Yong.Zhao@amd.com>

drm/amdgpu: Add more detail to the VM fault printing

With the printing, we don't need to parse the value on our own any more.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bfa3a9bb 27-Jun-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: keep stolen memory for arct

Any dce register read back from arct is invalid. use hard code
stolen memory for arct until we validate the s3.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Le Ma <Le.Ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f2d66571 10-Sep-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: skip pasid mapping for second mmhub on Arcturus

There's no LUT register for second mmhub to convert pasid since it has no ATC.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 51c60898 06-Sep-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: update vmc interrupt routine to support 3 vmhubs

There is one more vmc interrupt and mmhub on Arcturus.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Snow Zhang < Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7d19b15f 06-Sep-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: add VMC1 interrupt client id for Arcturus

New IH client id for VMC1.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Snow Zhang < Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 51cce480 04-Sep-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: use new mmhub interfaces for Arcturus

Arcturus has two MMHUBs.

Signed-off-by: Le Ma <le.ma@amd.com>
Acked-by: Snow Zhang < Snow.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c8a6e2a3 31-Aug-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: add one more mmhub instance for Arcturus (v2)

v2: set mmhub num under CHIP_ARCTURUS switch case and add one more mmhub id_mgr

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1daa2bfa 31-Aug-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: add new member in amdgpu_device for vmhub counts per asic chip

It aims to replace AMDGPU_MAX_VMHUBS in for loop to initialize registers.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a2d15ed7 16-Jul-2019 Le Ma <le.ma@amd.com>

drm/amdgpu: rename AMDGPU_GFXHUB/MMHUB macro with hub number

The number of GFXHUB/MMHUB may be expanded in later ASICs.

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3de2ff5d 04-Sep-2018 Le Ma <le.ma@amd.com>

drm/amdgpu: add gmc basic support for Arcturus

Add initial GMC support for Arcturus

Signed-off-by: Le Ma <le.ma@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7596ab68 25-Jun-2018 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amd/gmc9: rename AMDGPU_PTE_MTYPE to AMDGPU_PTE_MTYPE_VG10

To differentiate the mtypes across asics.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 450f30ea 13-Jun-2019 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

amdgpu: no need to check return value of debugfs_create functions

When calling debugfs functions, there is no need to ever check the
return value. The function can work or not, but the code logic should
never do something different based on this.

Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: xinhui pan <xinhui.pan@amd.com>
Cc: Evan Quan <evan.quan@amd.com>
Cc: Feifei Xu <Feifei.Xu@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f867723b 09-Jun-2019 Sam Ravnborg <sam@ravnborg.org>

drm/amd: drop use of drmP.h in amdgpu.h

Delete the unused drmP.h from amdgpu.h.
Fix fallout in various files.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
Cc: David Airlie <airlied@linux.ie>
Cc: Daniel Vetter <daniel@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20190609220757.10862-5-sam@ravnborg.org


# 02122753 28-May-2019 Flora Cui <flora.cui@amd.com>

drm/amdgpu: reserve stollen vram for raven series

to avoid screen corruption during modprobe.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fe2b5323 13-May-2019 Tiecheng Zhou <Tiecheng.Zhou@amd.com>

drm/amdgpu/sriov: Need to initialize the HDP_NONSURFACE_BAStE

it requires to initialize HDP_NONSURFACE_BASE, so as to avoid
using the value left by a previous VM under sriov scenario.

v2: it should not hurt baremetal, generalize it for both sriov
and baremetal

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Tiecheng Zhou <Tiecheng.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6121366b 08-May-2019 xinhui pan <xinhui.pan@amd.com>

drm/amdgpu: gmc handle ras resume

During S3/S4 bootloader will re-init ras state behind us.
Resume might fail or raise a gpu reset.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Tested-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 98cad2de 03-Mar-2019 Trigger Huang <Trigger.Huang@amd.com>

drm/amdgpu: Skip setting some regs under Vega10 VF

For Vega10 SR-IOV VF, skip setting some regs due to:
1, host will program them
2, avoid VF register programming violations

Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 36810fdb 08-May-2019 xinhui pan <xinhui.pan@amd.com>

drm/amdgpu: gmc support ras gpu reset

request a gpu reset if ras return EAGAIN.
we will run late init again so it is ok to do nothing this time.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 067e75b3 17-May-2019 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: set vram_width properly for SR-IOV

For SR-IOV, vram_width can't be read from ATOM as
RAVEN, and DF related registers is not readable, so hardcord
is the only way to set the correct vram_width.

Reviewed-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Signed-off-by: Yintian Tao <yttao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 37910935 16-May-2019 Flora Cui <flora.cui@amd.com>

drm/amdgpu: keep stolen memory on picasso

otherwise screen corrupts during modprobe.

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 14cfde84 09-Apr-2019 xinhui pan <xinhui.pan@amd.com>

drm/amdgpu: Add a check to avoid panic because of unexpected irqs

IP initialize ras in late_init, because of the BUGs of PSP or any
other components, driver receives unexpected irqs. It is ok to add such
check anyway.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 53d65054 08-Apr-2019 xinhui pan <xinhui.pan@amd.com>

drm/amdgpu: gmc use amdgpu_ras_feature_enable_on_boot

handle ras enable on boot.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Evan Quan <evan.quan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c1a8abd9 07-Nov-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: use ring/hash for fault handling on GMC9 v3

Further testing showed that the idea with the chash doesn't work as expected.
Especially we can't predict when we can remove the entries from the hash again.

So replace the chash with a ring buffer/hash mix where entries in the container
age automatically based on their timestamp.

v2: use ring buffer / hash mix
v3: check the timeout to make sure all entries age

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> (v2)
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f49ea9f8 06-Mar-2019 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: query sram ecc/ecc availability from atombios

query sram ecc capability via amdgpu_atomfirmware_ecc_default_enabled
query ecc availability via amdgpu_atomfirmware_sram_ecc_supported

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# acbbee01 06-Mar-2019 xinhui pan <xinhui.pan@amd.com>

drm/amdgpu: handle ras resume

Suspend will put irq, so resume need get irq back.
And in the same time, skip other ras initialization.

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9b54d201 11-Jan-2019 Eric Huang <JinhuiEric.Huang@amd.com>

drm/amdkfd: add RAS ECC event support (v3)

RAS ECC event will combine with GPU reset event, due to
ECC interrupts are caused by uncorrectable error that triggers
GPU reset.

v2: Fix misleading-indentation warning
v3: fix build with CONFIG_HSA_AMD disabled

Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 791c4769 23-Jan-2019 xinhui pan <xinhui.pan@amd.com>

drm/amdgpu: enable ras on gmc9

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 83afe835 07-Mar-2019 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Cosmetic change for calling func amdgpu_gmc_vram_location

Use function parameter mc as the second parameter of amdgpu_gmc_vram_location,
so codes look more consistent.

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6490bd76 24-Feb-2019 Yong Zhao <Yong.Zhao@amd.com>

drm/amdgpu: Eliminate the set_pde_pte function pointer in amdgpu_gmc_funcs

All the gmc_*_set_pde_pte functions are the same across different ASICs,
so we can eliminate the set_pde_pte function pointer and instead use a
generic function.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 72464382 18-Mar-2019 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix invalid use of change_bit

We only need to clear the bit in a 32bit integer.

This fixes a crah on ARM64 and PPC64LE caused by
"drm/amdgpu: update the vm invalidation engine layout V2"

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 913b2cb7 19-Feb-2019 Michael D Labriola <michael.d.labriola@gmail.com>

drm: change func to better detect wether swiotlb is needed

This commit fixes DRM failures on Xen PV systems that were introduced in
v4.17 by the following commits:

82626363 drm: add func to get max iomem address v2
fd5fd480 drm/amdgpu: only enable swiotlb alloc when need v2
1bc3d3cc drm/radeon: only enable swiotlb path when need v2

The introduction of ->need_swiotlb to the ttm_dma_populate() conditionals
in the radeon and amdgpu device drivers causes Gnome to immediately crash
on Xen PV systems, returning the user to the login screen. The following
kernel errors get logged:

[ 28.554259] radeon_dp_aux_transfer_native: 200 callbacks suppressed
[ 31.219821] radeon 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
[ 31.220030] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (16384000, 2, 4096, -14)
[ 31.226109] radeon 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
[ 31.226300] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (16384000, 2, 4096, -14)
[ 31.300734] gnome-shell[1935]: segfault at 88 ip 00007f39151cd904 sp 00007ffc97611ad8 error 4 in libmutter-cogl.so[7f3915178000+aa000]
[ 31.300745] Code: 5f c3 0f 1f 40 00 48 8b 47 78 48 8b 40 40 ff e0 66 0f 1f 44 00 00 48 8b 47 78 48 8b 40 48 ff e0 66 0f 1f 44 00 00 48 8b 47 78 <48> 8b 80 88 00 00 00 ff e0 0f 1f 00 48 8b 47 78 48 8b 40 68 ff e0
[ 38.193302] radeon_dp_aux_transfer_native: 116 callbacks suppressed
[ 40.009317] radeon 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
[ 40.009488] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (16384000, 2, 4096, -14)
[ 40.015114] radeon 0000:01:00.0: swiotlb buffer is full (sz: 2097152 bytes)
[ 40.015297] [drm:radeon_gem_object_create [radeon]] *ERROR* Failed to allocate GEM object (16384000, 2, 4096, -14)
[ 40.028302] gnome-shell[2431]: segfault at 2dadf40 ip 0000000002dadf40 sp 00007ffcd24ea5f8 error 15
[ 40.028306] Code: 20 6e 31 00 00 00 00 00 00 00 00 37 e3 3d 2d 7f 00 00 80 f4 e6 3d 2d 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 <00> 00 00 00 00 00 00 00 c1 00 00 00 00 00 00 00 80 e1 d2 03 00 00

This commit renames drm_get_max_iomem() to drm_need_swiotlb(), adds a
xen_pv_domain() check to it, and moves the bit shifting comparison that
always follows its usage into the function (simplifying the drm driver
code).

Signed-off-by: Michael D Labriola <michael.d.labriola@gmail.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/286987/


# cdba61da 23-Jan-2019 wentalou <Wentao.Lou@amd.com>

drm/amdgpu: sriov restrict max_pfn below AMDGPU_GMC_HOLE

sriov need to restrict max_pfn below AMDGPU_GMC_HOLE.
access the hole results in a range fault interrupt IIRC.

Signed-off-by: Wentao Lou <Wentao.Lou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c468f9e2 11-Dec-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: improve GMC v9 page fault message

Note if this is a retry fault or not and cleanup the message a bit.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 05794eff 20-Dec-2018 Shirish S <shirish.s@amd.com>

drm/amdgpu/gmc: fix compiler errors [-Werror,-Wmissing-braces] (V2)

Initializing structures with { } is known to be problematic since
it doesn't necessararily initialize all bytes, in case of padding,
causing random failures when structures are memcmp().

This patch fixes the structure initialisation related compiler
error by memset().

V2: rectified missing piece in coding

Signed-off-by: Shirish S <shirish.s@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c713a461 20-Nov-2018 Evan Quan <evan.quan@amd.com>

drm/amdgpu: update the vm invalidation engine layout V2

We need new invalidation engine layout due to new SDMA page
queues added.

V2: fix coding style and add correct return value

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 22666cc1 26-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move IV prescreening into the GMC code

The GMC/VM subsystem is causing the faults, so move the handling here as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 30da7bb1 26-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add missing error handling

We ignored the return code here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 47622ba0 30-Nov-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add a xgmi supported flag

Use this to track whether an asic supports xgmi rather than
checking the asic type everywhere.

Reviewed-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 82d1a1b1 15-Nov-2018 Chengming Gui <Jack.Gui@amd.com>

Revert "drm/amdgpu: use GMC v9 KIQ workaround only for the GFXHUB" (v2)

With GFXOFF enabled, this patch will cause PCO amdgpu_test failed,
but GFXOFF is necessary for PCO, so revert the patch.

This reverts commit b83761bb0b09ec11c924afe9d88e458cb16a0372.

v2: add a comment for future reference (Alex)

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jack Gui <Jack.Gui@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b83761bb 25-Oct-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: use GMC v9 KIQ workaround only for the GFXHUB

The MMHUB is not affected by this.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 396557b0 25-Oct-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: drop the busy wait for GMC v9 TLB invalidations

This code is not performance critical.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# af5fe1e9 25-Oct-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup GMC v9 TLB invalidation

Move the kiq handling into amdgpu_virt.c and drop the fallback.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6e82c6e0 30-Oct-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: drop the remaining uses of ring idx in messages

Consistently use the ring name instead.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c66ed765 19-Oct-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Retire amdgpu_ring.ready flag v4

Start using drm_gpu_scheduler.ready isntead.

v3:
Add helper function to run ring test and set
sched.ready flag status accordingly, clean explicit
sched.ready sets from the IP specific files.

v4: Add kerneldoc and rebase.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2a79d868 12-Oct-2018 Yong Zhao <Yong.Zhao@amd.com>

drm/amdgpu: Reorganize amdgpu_gmc_flush_gpu_tlb() for kfd to use

Add a flush_type parameter to that series of functions.

Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f54b30d7 17-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: make function pointers mandatory

We always want those to be setup correctly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 741deade 13-Sep-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: simplify Raven, Raven2, and Picasso handling

Treat them all as Raven rather than adding a new picasso
asic type. This simplifies a lot of code and also handles the
case of rv2 chips with the 0x15d8 pci id. It also fixes dmcu
fw handling for picasso.

Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 91468057 20-Aug-2018 Kenneth Feng <kenneth.feng@amd.com>

drm/amdgpu: enable mmhub power gating

Remove some functions due to the design change.
All the mmhub power gating sequence is moved to
smu fw.Driver sends the message to enable mmhub
powergating.We can also skip the fw version check
since the old fw version is in a very early stage
and we don't use that fw for release.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e21f561a 10-Jul-2018 Likun Gao <Likun.Gao@amd.com>

drm/amdgpu: add picasso support for gmc

Same as raven.

Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6fdd68b1 19-Jun-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: Adjust GART and AGP location with xgmi offset (v2)

On hives with xgmi enabled, the fb_location aperture is a size
which defines the total framebuffer size of all nodes in the
hive. Each GPU in the hive has the same view via the fb_location
aperture. GPU0 starts at offset (0 * segment size),
GPU1 starts at offset (1 * segment size), etc.

For access to local vram on each GPU, we need to take this offset into
account. This including on setting up GPUVM page table and GART table

v2: squash in "drm/amdgpu: Init correct fb region for none XGMI configuration"

Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Slava Abramov <slava.abramov@amd.com>
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>


# bf0a60b7 19-Jun-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: add a new gfxhub 1.1 helper for xgmi

Used to populate the xgmi info on vega20.

v2: PF_MAX_REGION is val - 1 (Ray)

Acked-by: Huang Rui <ray.huang@amd.com>
Acked-by: Slava Abramov <slava.abramov@amd.com>
Reviewed-by :Shaoyun liu <Shaoyun.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by :Shaoyun liu <Shaoyun.liu@amd.com>


# c3e1b43c 27-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: enable AGP aperture for GMC9 v2

Enable the old AGP aperture to avoid GART mappings.

v2: don't enable it for SRIOV

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6abc0c8f 30-Aug-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: don't keep stolen memory on vega20

Vega20 does not appear to be affected by the same issue
as vega10. Enable the full stolen memory handling on
vega20. Reserve the appropriate size at init time to avoid
display artifacts and then free it at the end of init once
the new FB is up and running.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 95010ba7 30-Aug-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: don't keep stolen memory on vega12

vega12 does not appear to be affected by the same issue
as vega10. Enable the full stolen memory handling on
vega12. Reserve the appropriate size at init time to avoid
display artifacts and then free it at the end of init once
the new FB is up and running.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6fb81375 30-Aug-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: don't keep stolen memory on Raven

Raven does not appear to be affected by the same issue
as vega10. Enable the full stolen memory handling on
Raven. Reserve the appropriate size at init time to avoid
display artifacts and then free it at the end of init once
the new FB is up and running.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106639
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cd2b5623 30-Aug-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: rework stolen vga memory handling

No functional change, just rework it in order to adjust the
behavior on a per asic level. The problem is that on vega10,
something corrupts the lower 8 MB of vram on the second
resume from S3. This does not seem to affect Raven, other
gmc9 based asics need testing.

Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 961c75cf 23-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move amdgpu_device_(vram|gtt)_location

Move that into amdgpu_gmc.c since we are really deadling with GMC
address space here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0c79c0bb 27-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: remove extra newline when printing VM faults

Looks like a copy&paste error to me.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7d0aa376 27-Aug-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Refine gmc9 VM fault print.

The fault reports the page number where the fault happend and not
the exact faulty address. Update the print message to reflect that.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bbc9fb10 21-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add GMC9 support for PDs/PTs in system memory

Add the necessary handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 11c3a249 21-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add amdgpu_gmc_pd_addr helper

Add a helper to get the root PD address and remove the workarounds from
the GMC9 code for that.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4e830fb1 21-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: remove gart.table_addr

We can easily figure out the address on the fly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1123b989 21-Aug-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: rename gart.robj into gart.bo

sed -i "s/gart.robj/gart.bo/" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/gart.robj/gart.bo/" drivers/gpu/drm/amd/amdgpu/*.h

Just cleaning up radeon leftovers.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1849e737 21-Aug-2018 kbuild test robot <fengguang.wu@intel.com>

drm/amdgpu: amdgpu_kiq_reg_write_reg_wait() can be static

Fixes: d790449835e6 ("drm/amdgpu: use kiq to do invalidate tlb")
Reviewed-by: Emily Deng <Emily.Deng@amd.com>
Signed-off-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ae74da3e 22-Aug-2018 Emily Deng <Emily.Deng@amd.com>

drm/amdgpu: Don't use kiq in gpu reset

When in gpu reset, don't use kiq, it will generate more TDR.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fc0faf04 21-Aug-2018 Emily Deng <Emily.Deng@amd.com>

drm/amdgpu/sriov: Only sriov runtime support use kiq

For sriov, don't use kiq in exclusive mode, as don't know how long time
it will take, some times it will occur exclusive timeout.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3890d111 17-Aug-2018 Emily Deng <Emily.Deng@amd.com>

drm/amdgpu: use kiq to do invalidate tlb

To avoid the tlb flush not interrupted by world switch, use kiq and one
command to do tlb invalidate.

v2:
Refine the invalidate lock position.

Signed-off-by: Emily Deng <Emily.Deng@amd.com>
Reviewed-and-Tested-by: Rex Zhu <Rex.Zhu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2cddc50e 13-Aug-2018 Huang Rui <ray.huang@amd.com>

drm/amdgpu: move gem definitions into amdgpu_gem header

Demangle amdgpu.h.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a3d9103e 22-Aug-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Fix page fault and kasan warning on pci device remove.

Problem:
When executing echo 1 > /sys/class/drm/card0/device/remove kasan warning
as bellow and page fault happen because adev->gart.pages already freed by the
time amdgpu_gart_unbind is called.

BUG: KASAN: user-memory-access in amdgpu_gart_unbind+0x98/0x180 [amdgpu]
Write of size 8 at addr 0000000000003648 by task bash/1828
CPU: 2 PID: 1828 Comm: bash Tainted: G W O 4.18.0-rc1-dev+ #29
Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming/AX370-Gaming-CF, BIOS F3 06/19/2017
Call Trace:
dump_stack+0x71/0xab
kasan_report+0x109/0x390
amdgpu_gart_unbind+0x98/0x180 [amdgpu]
ttm_tt_unbind+0x43/0x60 [ttm]
ttm_bo_move_ttm+0x83/0x1c0 [ttm]
ttm_bo_handle_move_mem+0xb97/0xd00 [ttm]
ttm_bo_evict+0x273/0x530 [ttm]
ttm_mem_evict_first+0x29c/0x360 [ttm]
ttm_bo_force_list_clean+0xfc/0x210 [ttm]
ttm_bo_clean_mm+0xe7/0x160 [ttm]
amdgpu_ttm_fini+0xda/0x1d0 [amdgpu]
amdgpu_bo_fini+0xf/0x60 [amdgpu]
gmc_v8_0_sw_fini+0x36/0x70 [amdgpu]
amdgpu_device_fini+0x2d0/0x7d0 [amdgpu]
amdgpu_driver_unload_kms+0x6a/0xd0 [amdgpu]
drm_dev_unregister+0x79/0x180 [drm]
amdgpu_pci_remove+0x2a/0x60 [amdgpu]
pci_device_remove+0x5b/0x100
device_release_driver_internal+0x236/0x360
pci_stop_bus_device+0xbf/0xf0
pci_stop_and_remove_bus_device_locked+0x16/0x30
remove_store+0xda/0xf0
kernfs_fop_write+0x186/0x220
__vfs_write+0xcc/0x330
vfs_write+0xe6/0x250
ksys_write+0xb1/0x140
do_syscall_64+0x77/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f66ebbb32c0

Fix:
Split gmc_v{6,7,8,9}_0_gart_fini to postpone amdgpu_gart_fini to after
memory managers are shut down since gart unbind happens
as part of this procedure

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# eb7e5cfc 22-Aug-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Fix page fault and kasan warning on pci device remove.

Problem:
When executing echo 1 > /sys/class/drm/card0/device/remove kasan warning
as bellow and page fault happen because adev->gart.pages already freed by the
time amdgpu_gart_unbind is called.

BUG: KASAN: user-memory-access in amdgpu_gart_unbind+0x98/0x180 [amdgpu]
Write of size 8 at addr 0000000000003648 by task bash/1828
CPU: 2 PID: 1828 Comm: bash Tainted: G W O 4.18.0-rc1-dev+ #29
Hardware name: Gigabyte Technology Co., Ltd. AX370-Gaming/AX370-Gaming-CF, BIOS F3 06/19/2017
Call Trace:
dump_stack+0x71/0xab
kasan_report+0x109/0x390
amdgpu_gart_unbind+0x98/0x180 [amdgpu]
ttm_tt_unbind+0x43/0x60 [ttm]
ttm_bo_move_ttm+0x83/0x1c0 [ttm]
ttm_bo_handle_move_mem+0xb97/0xd00 [ttm]
ttm_bo_evict+0x273/0x530 [ttm]
ttm_mem_evict_first+0x29c/0x360 [ttm]
ttm_bo_force_list_clean+0xfc/0x210 [ttm]
ttm_bo_clean_mm+0xe7/0x160 [ttm]
amdgpu_ttm_fini+0xda/0x1d0 [amdgpu]
amdgpu_bo_fini+0xf/0x60 [amdgpu]
gmc_v8_0_sw_fini+0x36/0x70 [amdgpu]
amdgpu_device_fini+0x2d0/0x7d0 [amdgpu]
amdgpu_driver_unload_kms+0x6a/0xd0 [amdgpu]
drm_dev_unregister+0x79/0x180 [drm]
amdgpu_pci_remove+0x2a/0x60 [amdgpu]
pci_device_remove+0x5b/0x100
device_release_driver_internal+0x236/0x360
pci_stop_bus_device+0xbf/0xf0
pci_stop_and_remove_bus_device_locked+0x16/0x30
remove_store+0xda/0xf0
kernfs_fop_write+0x186/0x220
__vfs_write+0xcc/0x330
vfs_write+0xe6/0x250
ksys_write+0xb1/0x140
do_syscall_64+0x77/0x1e0
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7f66ebbb32c0

Fix:
Split gmc_v{6,7,8,9}_0_gart_fini to postpone amdgpu_gart_fini to after
memory managers are shut down since gart unbind happens
as part of this procedure

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6cdf4e87 24-Jul-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: clarify GPUVM fault error message

The address printed is the actual address, not the page.

Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 44a99b65 25-May-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amd: Use newly added interrupt source defs for SOC15.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# efaa9646 28-Jun-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Present amdgpu_task_info in VM_FAULTS.

Extract and present the reposnsible process and thread when
VM_FAULT happens.

v2: Use getter and setter functions.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Jim Qu <Jim.Qu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e1d1a772 10-May-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: disable partial wr rmw if ECC is not enabled

The vbios mistakenly sets this bit on some boards without ECC.
This can lead to reduced performance in some workloads. Disable
the bit if the board does not have ECC.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d96b428c 19-Apr-2018 Feifei Xu <Feifei.Xu@amd.com>

drm/amdgpu/gmc9: Add vega20 support

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b6110c00 05-Feb-2018 Feifei Xu <Feifei.Xu@amd.com>

drm/amdgpu: Fix hardcoded base offset of vram pages

In gmc_v9_0_vram_gtt_location(),the vram_base_offset is hardcoded
to 0 in dGPU. Fix it by reading mmMC_VM_FB_OFFSET or return
zfb_phys_addr if ZFB is enabled.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 323a9dbc 10-May-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: remove unused register defs

These got moved to the new df module so no longer
used in this file.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bfa8eea2 18-Apr-2018 Flora Cui <Flora.Cui@amd.com>

drm/amdgpu: init gfx9 aperture settings

fix settings.

Signed-off-by: Flora Cui <Flora.Cui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6f752ec2 06-Apr-2018 Andrey Grodzovsky <andrey.grodzovsky@amd.com>

drm/amdgpu: Free VGA stolen memory as soon as possible.

Reserved VRAM is used to avoid overriding pre OS FB.
Once our display stack takes over we don't need the reserved
VRAM anymore.

v2:
Remove comment, we know actually why we need to reserve the stolen VRAM.
Fix return type for amdgpu_ttm_late_init.
v3:
Return 0 in amdgpu_bo_late_init, rebase on changes to previous patch
v4: rebase
v5:
For GMC9 reserve always just 9M and keep the stolem memory around
until GART table curruption on S3 resume is resolved.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ebdef28e 06-Apr-2018 Alex Deucher <alexdeucher@gmail.com>

drm/amdgpu/gmc: steal the appropriate amount of vram for fw hand-over (v3)

Steal 9 MB for vga emulation and fb if vga is enabled, otherwise,
steal enough to cover the current display size as set by the vbios.

If no memory is used (e.g., secondary or headless card), skip
stolen memory reserve.

v2: skip reservation if vram is limited, address Christian's comments
v3: squash in fix from Harry

Reviewed-and-Tested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> (v2)
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>


# f8bc9037 27-Mar-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: use amdgpu_ring_emit_reg_write_reg_wait in gpu tlb flush

Use amdgpu_ring_emit_reg_write_reg_wait. On engines that support it,
it provides a write and wait in a single packet which avoids a missed
ack if a world switch happens between the request and waiting for the
ack.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 070706c0 28-Mar-2018 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: switch to use df callback functions

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 273a14cd 13-Mar-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: add vega12 support (v2)

Same as vega10.

v2: squash in golden regs fix from Feifei

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>


# 3760f76c 08-Mar-2018 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Move IH clientid defs to separate file

This is preparation for sharing client ID definitions
between amdgpu and amdkfd

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1e09b053 08-Mar-2018 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: query vram type from atombios

The vram type for dGPU is stored in umc_info while sys mem type
for APU is stored in integratedsysteminfo

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fd430a70 18-Jan-2018 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu: skip ECC for SRIOV in gmc late_init

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 82d0ece9 26-Feb-2018 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Correct VRAM width for APUs with GMC9

DDR4 has a 64-bit width not 128-bits. It was reporting
twice the width. Tested with my Ryzen 2400G.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fe19b862 23-Jan-2018 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu: increase gart size to 512MB

256MB is too small consider PTE/PDE shadow and TTM
eviction activity

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7b6cbae2 18-Jan-2018 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu: skip ECC for SRIOV in gmc late_init

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 585b7f16 26-Feb-2018 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Correct VRAM width for APUs with GMC9

DDR4 has a 64-bit width not 128-bits. It was reporting
twice the width. Tested with my Ryzen 2400G.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c633c00b 04-Feb-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: separate PASID mapping from VM flush v2

Stuffing the PASID mapping into the VM flush isn't flexible enough since
the PASID mapping changes not as often as we need a VM flush.

v2: add missing use of gmc_v7_0_emit_pasid_mapping

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3d918c0e 06-Feb-2018 Shaoyun Liu <Shaoyun.Liu@amd.com>

drm/amdgpu: Avoid get vram info from atom bios on emulation mode

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f732b6b3 26-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move waiting for VM flush into gmc_v9_0_emit_flush_gpu_tlb

Keep that at a common place instead of spread over all engines.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 69882565 19-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add optional ring to *_hdp callbacks

This adds an optional ring to the invalidate_hdp and flush_hdp
callbacks. If the ring isn't specified or the emit_wreg function not
available the HDP operation will be done with the CPU otherwise by
writing on the ring.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 250b4228 16-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add PASID mapping for GMC v9

This way we can see the PASID in VM faults.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9096d6e5 12-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: implement gmc_v9_0_emit_flush_gpu_tlb

Unify tlb flushing for gmc v9.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 156a81be 17-Jan-2018 Chunming Zhou <david1.zhou@amd.com>

drm/amdgpu: all vram is visible for APU (v2)

missed in gmc9.

v2: squash in build fix (Rex)

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 132f34e4 12-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move struct gart_funcs into amdgpu_gmc.h

And rename it to struct gmc_funcs.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Li <Samuel.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 770d13b1 12-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move struct amdgpu_mc into amdgpu_gmc.h

And rename it to amdgpu_gmc as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Samuel Li <Samuel.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3816e42f 09-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: rename pas_id to pasid

sed -i "s/pas_id/pasid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/pas_id/pasid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b1d12868 05-Jan-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: adjust HDP write queue flushing for tlb invalidation

Separate tlb invalidation and hdp flushing and move the HDP
flush to the caller.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fd5fd480 08-Feb-2018 Chunming Zhou <david1.zhou@amd.com>

drm/amdgpu: only enable swiotlb alloc when need v2

get the max io mapping address of system memory to see if it is over
our card accessing range.
v2: move checking later

Signed-off-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180209024410.1469-2-david1.zhou@amd.com


# 5ba4fa35 17-Jan-2018 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: only check for ECC on Vega10

RV doesn't support it.

Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# c4f46f22 18-Dec-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: rename vm_id to vmid

sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 620f774f 18-Dec-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: separate VMID and PASID handling

Move both into the new files amdgpu_ids.[ch]. No functional change.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6a42fd6f 05-Dec-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: implement 2+1 PD support for Raven v3

Instead of falling back to 2 level and very limited address space use
2+1 PD support and 128TB + 512GB of virtual address space.

v2: cleanup defines, rebase on top of level enum
v3: fix inverted check in hardware setup

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-and-Tested-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2543e28a 14-Dec-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: rename amdgpu_*_location functions

add device to the name for consistency.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9c3f2b54 14-Dec-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: rename amdgpu_program_register_sequence

add device for consistency with other functions in this file.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a7ea6548 08-Dec-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: setup the shared and private apertures on gfx9

Same as previous asics. This was not yet set for gfx9.

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bf383fb6 08-Dec-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: convert nbio to use callbacks (v2)

Cleans up and consolidates all of the per-asic logic.

v2: squash in "drm/amdgpu: fix NULL err for sriov detect" (Chunming)

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3de676d8 29-Nov-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: allow get_vm_pde to change flags as well

And also provide the level for which we need a PDE.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4fd09a19 29-Nov-2017 Shaoyun Liu <Shaoyun.Liu@amd.com>

drm/admgpu: Reduce the usage of soc15ip.h

Remove the header where it's not used.

Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 946a4d5b 28-Nov-2017 Shaoyun Liu <Shaoyun.Liu@amd.com>

drm/amdgpu: Avoid use SOC15_REG_OFFSET in static const array

Handle dynamic offsets correctly in static arrays.

Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f3368128 22-Nov-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: move validation of the VM size into the VM code

This moves validation of the VM size parameter into amdgpu_vm_adjust_size().

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b38f41eb 22-Nov-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: unify VM size handling of Vega10 with older generation

One function to rule them all.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fb960bd2 23-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include:cleanup vega10 header files.

Remove asic_reg/vega10 folder.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 65417d9f 22-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include:cleanup vega10 mmhub header files.

Cleanup asic_reg/vega10/MMHUB folder.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# cde5c34f 23-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include:cleanup vega10 gc header files.

Cleanup asic_reg/vega10/GC folder.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 135d4b10 23-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include:cleanup vega10 dce header files.

Cleanup asic_reg/vega10/DC folder.Remove dce_12_0_default.h.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 90c7a935 22-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include: cleanup vega10 umc header files.

Remove asic/vega10/UMC folder.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6ce68225 16-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include:cleanup vega10 athub header files.

Cleanup asic_reg/vega10/ATHUB folder,remove unused files.

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 75199b8c 15-Nov-2017 Feifei Xu <Feifei.Xu@amd.com>

drm/amd/include:cleanup vega10 hdp header files.

Cleanup asic_reg/vega10/HDP folder, remove hdp_4_0_default.h

Signed-off-by: Feifei Xu <Feifei.Xu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ce1b1b66 20-Nov-2017 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu:partially revert 1cfd8e237f0318e330190ac21d63c58ae6a1f66c

found RING0 test fail after S3 resume regression, which is
introduced by 1cfd8e237f0318e330190ac21d63c58ae6a1f66c

Because after suspend VRAM will be cleared, so driver must
unpin the GART table(resident in VRAM) during suspend so it
can be evicted to system ram and must correspondingly pin it
during resume so the GART table could be restored to VRAM.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5a16008f 17-Nov-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: make some ECC messages debug only

To avoid spamming the logs on non-ECC boards.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f59548c8 13-Nov-2017 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu:fix NULL pointer access during drv remove

NULL pointer is because original logic will step into
set_pde_pte() even after the gart.ptr is freed due to
there are twice gart_unbind() on all gart area.

also, there are other minor fixes:
1,since gart_init only create dummy page, the corresponding
gart_fini shouldn't do more like unbinding all GART, this is
unnecessary because in driver fini stage all GART unbinding
had already been done during each IP's SW_FINI (GMC's
SW_FINI is the last one called), so remove the step
for the GART unbinding in gart_fini().

2,gart_fini() is already invoked during each GMC IP's gart_fini
routine,e.g. gmc_vx_0_gart_fini(), so no need to manually
call it during ttm_fini().

3,amdgpu_gem_force_release() should be put ahead of
amdgpu_vm_manager_fini()

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c79ee7d8 13-Nov-2017 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu:cleanup GMC & gart garbage function

for gart_ram_alloc/free, they are never used in driver thus
ripe them out totally.

for gart_vram_pin/unpin, they are not needed becuase we can
use bo_creat_kernel/free to replace the original manual way
in the gart_vram_alloc/free, thus gart_vram_pin/unpin can
also be riped out.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 02bab923 15-Sep-2017 David Panariti <David.Panariti@amd.com>

drm/amdgpu: Add ability to determine and report if board supports ECC.

Make initialization code check the ECC related registers, which are initialized
by the VBIOS, to see if ECC is present and initialized and DRM_INFO() the
result.

Signed-off-by: David Panariti <David.Panariti@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fdd5faaa 04-Nov-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup vm_size handling

It's pointless to have the same value twice, just always use max_pfn.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c47b41a7 03-Nov-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: remove nonsense const u32 cast on ARRAY_SIZE result

Not sure what that should originally been good for, but it doesn't seem
to make any sense any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d6895ad3 28-Feb-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: resize VRAM BAR for CPU access v6

Try to resize BAR0 to let CPU access all of VRAM.

v2: rebased, style cleanups, disable mem decode before resize,
handle gmc_v9 as well, round size up to power of two.
v3: handle gmc_v6 as well, release and reassign all BARs in the driver.
v4: rename new function to amdgpu_device_resize_fb_bar,
reenable mem decoding only if all resources are assigned.
v5: reorder resource release, return -ENODEV instead of BUG_ON().
v6: squash in rebase fix

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c5066129 06-Jun-2017 ozeng <oak.zeng@amd.com>

drm/amdgpu: Properly allocate VM invalidate eng v2

v1: Properly allocate TLB invalidation engine to avoid conflict.
v2: Added comments to codes

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 5c583018 20-Sep-2017 Evan Quan <evan.quan@amd.com>

drm/amd/amdgpu: add vega10/raven mmhub/athub golden settings

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1d4e0a8c 15-Sep-2017 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu:hdp flush should be put it initialized

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f053cd47 01-Sep-2017 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Cleanup gmc_v9_0_suspend()

Even though fini returns 0 always it could theoretically
fail in the future. Might as well return it instead of 0.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4d9c333a 01-Sep-2017 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Tidy up gmc_v9_0_hw_init()

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 846347c9 01-Sep-2017 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Tidy up gmc_v9_0_gart_enable()

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ae6d1416 01-Sep-2017 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Simplify gmc_v9_0_vm_fault_interrupt_state()

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c3db7b5a 22-Aug-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move default gart size setting into gmc modules

Move the asic specific code into the IP modules.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# faf50567 22-Aug-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: move default gart size setting into gmc modules

Move the asic specific code into the IP modules.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d07f14be 15-Aug-2017 Roger He <Hongbo.He@amd.com>

drm/amd/amdgpu: expose fragment size as module parameter (v2)

Allow overrides on the command line.

v2: agd: sqaush in spelling fix and bogus default value warning

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e618d306 11-Aug-2017 Roger He <Hongbo.He@amd.com>

drm/amd/amdgpu: store fragment_size in vm_manager

adds fragment_size in the vm_manager structure and
implements hardware setup for it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Roger He <Hongbo.He@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# edca2d05 24-Jul-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: disable legacy vga features in gmc init

Needs to be done when the MC is set up.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 6f02a696 07-Jul-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: consistent name all GART related parts

Rename symbols from gtt_ to gart_ as appropriate.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# ed21c047 06-Jul-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: remove gtt_base_align handling

Not used any more.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8d6a5230 05-Jul-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/gmc9: get vram width from atom for Raven

Get it from the system info table.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 011d4bbe 26-Jun-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup initializing gtt_size

Stop spreading the code over all GMC generations.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# fd66560b 21-Jun-2017 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: enable 4 level page table on raven (v3)

v1: enable 4 level-page table on raven
v2: add back legacy 2 level page table on raven
v3: set num_level in initial switch statement

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f8386b35 19-Jun-2017 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add new flag AMD_PG_SUPPORT_MMHUB

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2fcd43ce 19-Jun-2017 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add mmhub pg init sequence on raven

MMHub Powergating init sequence.

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b9509c80 01-Jun-2017 Huang Rui <ray.huang@amd.com>

drm/amdgpu: update to use RREG32_SOC15/WREG32_SOC15 for gmc9

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 916910ad 30-May-2017 Huang Rui <ray.huang@amd.com>

drm/amdgpu: fix the gart table cleared issue for S3

Something writes over the first 8 MB so reserve this
on vega10 until we root cause it.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 13052be5 31-May-2017 Huang Rui <ray.huang@amd.com>

drm/amdgpu: export mmhub get clockgating into gmc

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d5583d4f 31-May-2017 Huang Rui <ray.huang@amd.com>

drm/amdgpu: export mmhub set clockgating into gmc

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 77f6c763 31-May-2017 Huang Rui <ray.huang@amd.com>

drm/amdgpu: export mmhub sw_init into gmc

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0c8c0847 31-May-2017 Huang Rui <ray.huang@amd.com>

drm/amdgpu: export gfxhub sw_init into gmc

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# b1166325 12-May-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup adjust_mc_addr handling v4

Rename adjust_mc_addr to get_vm_pde and check the address bits in one place.

v2: handle vcn as well, keep setting the valid bit manually,
add a BUG_ON() for GMC v6, v7 and v8 as well.
v3: handle vcn_v1_0_enc_ring_emit_vm_flush as well.
v4: fix the BUG_ON mask for GFX6-8

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 05ec3eda 11-May-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup VM manager init/fini

VM is mandatory for all hw amdgpu supports. So remove the leftovers
to make it optionally.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# aecbe64f 04-May-2017 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu: apply nbio7 for Raven (v3)

nbio handles misc bus io operations. Handle
differences between different nbio bus versions.

v2: switch checks from RAVEN to APU (Alex)
squash in raven rev id fetch
squash in fix uninitalized hdp flush reg index for raven
v3: add some missed RAVEN to APU checks (Alex)

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bc099ee9 15-Jan-2017 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu/gmc9: change fb offset sequence so that used wider

Initialize the values earlier.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2d8e898e 14-Dec-2016 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu/gmc9: set mc vm fb offset for raven

APU fb offset is set by sbios, which is different with DGPU.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e4f3abaa 07-Dec-2016 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu: add raven case for gmc9 golden setting

Golden settings for GMC9.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5dd696ae 26-Apr-2017 Trigger Huang <trigger.huang@amd.com>

drm/amdgpu: Bypass GMC/UVD/VCE hw_fini in SR-IOV

On vega10, some hw finish operations should not be applied in SR-IOV
case. This works as workaround to fix multi-VFs reboot/shutdown
issues.

Signed-off-by: Trigger Huang <trigger.huang@amd.com>
Reviewed-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Reviewed-by: Monk Liu <monk.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 32601d48 10-May-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix fundamental suspend/resume issue

Reinitializing the VM manager during suspend/resume is a very very bad
idea since all the VMs are still active and kicking.

This can lead to random VM faults after resume when new processes
become the same client ID assigned.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# b3c85a0f 10-May-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix fundamental suspend/resume issue

Reinitializing the VM manager during suspend/resume is a very very bad
idea since all the VMs are still active and kicking.

This can lead to random VM faults after resume when new processes
become the same client ID assigned.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org


# 43b9176f 28-Apr-2017 Shaoyun Liu <Shaoyun.Liu@amd.com>

drm/amdgpu: Reserve 0-2 invalidation reg sets for none-amdgpu usages

Firmware used reg set 2 for tlb invalidation. AMDGPU can start from reg
set 3 to avoid the conflict. AMDKFD will use the reg set 0 or 1 when
necesary.

Signed-off-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Reviewws-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 55ed8caf 21-Apr-2017 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu: increase gtt size to 3GB by default v2

v2: address Alex's comment, add AMDGPU_DEFAULT_GTT_SIZE_MB.

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 775f55f1 19-Apr-2017 Tom St Denis <tom.stdenis@amd.com>

drm/amd/amdgpu: Print out ring name in dev_info

So it's more obvious which rings are using which INV engines.

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4789c463 31-Mar-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: assign VM invalidation engine manually v2

For Vega10 we have 18 VM invalidation engines for each VMHUB.

Start to assign them manually to the rings.

v2: add a BUG_ON if we use to many engines

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7645670d 06-Apr-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: split VMID management by VMHUB

This way GFX and MM won't fight for VMIDs any more.

Initially disabled since we need to stop flushing all HUBS
at the same time as well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bab4fee7 04-Apr-2017 Junwei Zhang <Jerry.Zhang@amd.com>

drm/amdgpu: set vm size and block size by individual gmc by default (v3)

By default, the value is set by individual gmc.
if a specific value is input, it overrides the global value for all

v2: create helper funcs
v3: update gmc9 APU's num_level athough it may be updated in the future.

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 03f89feb 04-Apr-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup get_invalidate_req v2

The two hubs are just instances of the same hardware,
so the register bits are identical.

v2: keep the function pointer

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 36b32a68 29-Mar-2017 Zhang, Jerry <Jerry.Zhang@amd.com>

drm/amdgpu: fix vm size and block size for VMPT (v5)

Set reasonable defaults per family.

v2: set both of them in gmc
v3: move vm size and block size in vm manager
v4: squash in warning fix from Alex Xie
v5: squash in min() warning fix

Signed-off-by: Junwei Zhang <Jerry.Zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 11250164 30-Mar-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup VMHUB bit definitions v2

The two hubs are just instances of the same hardware,
so the register bits are identical.

v2: only remove get_vm_protection_bits for now

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# f75e237c 30-Mar-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: move adjust_mc_addr into amdgpu_gart_funcs

We should probably rename amdgpu_gart_funcs sooner or later.

Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 5a9b8e8a 30-Mar-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: fix VMHUB order to match the hardware

Match our defines with what the hw uses.

Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 9ceaeeaf 28-Mar-2017 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Fix Vega10 VM initialization

adev->family is not initialized yet when amdgpu_get_block_size is
called. Use adev->asic_type instead.

Minimum VM size is 512GB, not 256GB, for a single page table entry
in the root page table.

gmc_v9_0_vm_init is called after adev->vm_manager.max_pfn is
initialized. Move the minimum VM-size enforcement ahead of max_pfn
initializtion. Cast to 64-bit before the left-shift.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 4d6cbde3 28-Mar-2017 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Clean up GFX 9 VM fault messages

Clean up the VM fault message format and use rate-limiting similar
to other ASICs.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d7c434d3 28-Mar-2017 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Register UTCL2 as a source of VM faults

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# eeb2487d 23-Mar-2017 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu:fix missing programing critical registers

those MC_VM registers won't be programed by VBIOS in VF
so driver is responsible to programe them.

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 79a0c465 22-Mar-2017 Monk Liu <Monk.Liu@amd.com>

drm/amdgpu:fix gmc_v9 vm fault process for SRIOV

for SRIOV we cannot use access register when in IRQ routine
with regular KIQ method

Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 2de6a7c5 26-Mar-2017 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu: enable four level VMPT for gmc9

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 36f8c1f9 26-Mar-2017 Chunming Zhou <David1.Zhou@amd.com>

drm/amdgpu: adapt vm size for multi vmpt

Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8437a097 17-Oct-2016 Christian König <christian.koenig@amd.com>

drm/amdgpu: add num_level to the VM manager

Needs to be filled with handling.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c7a7266b 28-Feb-2017 Xiangliang Yu <Xiangliang.Yu@amd.com>

drm/amdgpu/gmc9: no need use kiq in vega10 tlb flush

two reasons:
1. there is a spinlock around;
2. vm register is pf/vf copy, vf can access via mmio safely.

Signed-off-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Monk Liu <Monk.Liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e60f8db5 09-Mar-2017 Alex Xie <AlexBin.Xie@amd.com>

drm/amdgpu: Add GMC 9.0 support (v2)

On SOC-15 parts, the GMC (Graphics Memory Controller) consists
of two hubs: GFX (graphics and compute) and MM (sdma, uvd, vce).

v2: drop sdma from Makefile, fix duplicate return statement.

Signed-off-by: Alex Xie <AlexBin.Xie@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>