History log of /linux-master/drivers/gpu/drm/amd/amdgpu/amdgpu_ih.h
Revision Date Author Comments
# bcfb9cee 15-Sep-2023 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: Increase IH soft ring size for GFX v9.4.3 dGPU

On GFX v9.4.3 dGPU, applications have random timeout failure when XNACK
on, dmesg log has "amdgpu: IH soft ring buffer overflow 0x900, 0x900",
because dGPU mode has 272 cam entries. After increasing IH soft ring
to 512 entries, no more IH soft ring overflow message and application
passed.

Fixes: bf80d34b6c58 ("drm/amdgpu: Increase soft IH ring size")
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# bf80d34b 07-Jul-2023 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: Increase soft IH ring size

Retry faults are delegated to soft IH ring and then processed by
deferred worker. Current soft IH ring size PAGE_SIZE can store 128
entries, which may overflow and drop retry faults, causes HW stucks
because the retry fault is not recovered.

Increase soft IH ring size to 8KB, enough to store 256 CAM entries
because we clear the CAM entry after handling the retry fault from soft
ring.

Define macro IH_RING_SIZE and IH_SW_RING_SIZE to remove duplicate
constant.

Show warning message if soft IH ring overflows with CAM enabled because
this should not happen.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3c2d6ea2 18-Nov-2021 Philip Yang <Philip.Yang@amd.com>

drm/amdgpu: handle IH ring1 overflow

IH ring1 is used to process GPU retry fault, overflow is enabled to
drain retry fault because we want receive other interrupts while
handling retry fault to recover range. There is no overflow flag set
when wptr pass rptr. Use timestamp of rptr and wptr to handle overflow
and drain retry fault.

If fault timestamp goes backward, the fault is filtered and should not
be processed. Drain fault is finished if processed_timestamp is equal to
or larger than checkpoint timestamp.

Add amdgpu_ih_functions interface decode_iv_ts for different chips to
get timestamp from IV entry with different iv size and timestamp offset.
amdgpu_ih_decode_iv_ts_helper is used for vega10, vega20, navi10.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d423f551 12-Mar-2021 Christian König <christian.koenig@amd.com>

drm/amdgpu: nuke the ih reentrant lock

Interrupts on are non-reentrant on linux. This is just an ancient
leftover from radeon where irq processing was kicked of from different
places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3f1d1eb2 23-Feb-2021 Jonathan Kim <jonathan.kim@amd.com>

drm/amdgpu: add ih waiter on process until checkpoint

Add IH function to allow caller to wait until ring entries are processed
until the checkpoint write pointer.

This will be primarily used by HMM to drain pending page fault interrupts
before memory unmap to prevent HMM from handling stale interrupts.

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 78bd101c 25-Nov-2020 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add a helper function to decode iv

since from soc15, all the chips share the same
iv format. create a common helper to decode iv

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3c06aaff 01-Dec-2020 Hawking Zhang <Hawking.Zhang@amd.com>

drm/amdgpu: add amdgpu_ih_regs structure

amdgpu_ih_regs holds all the registers for
an ih ring

Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Dennis Li <Dennis.Li@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 26f32a37 31-Oct-2020 Christian König <christian.koenig@amd.com>

drm/amdgpu: add infrastructure for soft IH ring

Add a soft IH ring implementation similar to the hardware IH1/2.

This can be used if the hardware delegation of interrupts to IH1/2
doesn't work for some reason.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8c65fe5f 05-Mar-2019 Christian König <christian.koenig@amd.com>

drm/amdgpu: limit the number of IVs processed at once

Only process a maximum of 32 IVs before writing back the RPTR. This improves
hw handling when we get close to an overflow in the ring buffer.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# e2fb6e0a 09-Jan-2019 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup amdgpu_ih_process a bit more

Remove the callback and call the dispatcher directly.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d81f78b4 18-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: simplify IH programming

Calculate all the addresses and pointers in amdgpu_ih.c

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 8bb9eb48 17-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: add IH ring to ih_get_wptr/ih_set_rptr v2

Let's start to support multiple rings.

v2: decode IV is needed as well

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 22666cc1 26-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move IV prescreening into the GMC code

The GMC/VM subsystem is causing the faults, so move the handling here as
well.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1ffdeca6 17-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move more defines into amdgpu_irq.h

Everything that isn't related to the IH ring.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 1f896946 17-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: move more interrupt processing into amdgpu_irq.c

Add a callback to amdgpu_ih_process to remove most of the IV logic.

Signed-off-by: Christian König <christian.koenig@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 425c3143 16-Sep-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: cleanup amdgpu_ih.c

Cleanup amdgpu_ih.c to be able to handle multiple interrupt rings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 240cd9a6 05-Sep-2018 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Move fault hash table to amdgpu vm

In stead of share one fault hash table per device, make it
per vm. This can avoid inter-process lock issue when fault
hash table is full.

Change-Id: I5d1281b7c41eddc8e26113e010516557588d3708
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Christian Konig <Christian.Koenig@amd.com>
Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# aa47d117 02-Aug-2018 Huang Rui <ray.huang@amd.com>

drm/amdgpu: move ih definitions into amdgpu_ih header

Demangle amdgpu.h

Signed-off-by: Huang Rui <ray.huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3760f76c 08-Mar-2018 Oak Zeng <Oak.Zeng@amd.com>

drm/amdgpu: Move IH clientid defs to separate file

This is preparation for sharing client ID definitions
between amdgpu and amdkfd

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 3816e42f 09-Jan-2018 Christian König <christian.koenig@amd.com>

drm/amdgpu: rename pas_id to pasid

sed -i "s/pas_id/pasid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/pas_id/pasid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# c4f46f22 18-Dec-2017 Christian König <christian.koenig@amd.com>

drm/amdgpu: rename vm_id to vmid

sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.c
sed -i "s/vm_id/vmid/g" drivers/gpu/drm/amd/amdgpu/*.h

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# a2f14820 26-Aug-2017 Felix Kuehling <Felix.Kuehling@amd.com>

drm/amdgpu: Track pending retry faults in IH and VM (v2)

IH tracks pending retry faults in a hash table for fast lookup in
interrupt context. Each VM has a short FIFO of pending VM faults for
processing in a bottom half.

The IH prescreening stage adds retry faults and filters out repeated
retry interrupts to minimize the impact of interrupt storms.

It's the VM's responsibility remove pending faults once they are
handled. For now this is only done when the VM is destroyed.

v2:
- Made the hash table smaller and the FIFO longer. I never want the
FIFO to fill up, because that would make prescreen take longer.
128 pending page faults should be enough to keep migrations busy.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Acked-by: Christian König <christian.koenig@amd.com> (v1)
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 88b5af70 28-Dec-2016 Leo Liu <leo.liu@amd.com>

drm/amdgpu: add vcn ip block functions (v2)

Fill in the core VCN 1.0 setup functionality.

v2: squash in fixup (Alex)

Signed-off-by: Leo Liu <leo.liu@amd.com>
Acked-by: Chunming Zhou <david1.zhou@amd.com>
Acked-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 614dea31 03-Mar-2017 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: update IH IV ring entry for soc-15

Reflect the new format on soc-15 asics.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 70170d14 09-Mar-2017 ken <Qingqing.Wang@amd.com>

drm/amdgpu: add clinetid definition for vega10

Signed-off-by: ken <Qingqing.Wang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 7ccf5aa8 29-Nov-2016 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu/ih: store the full context id

The contextID field (formerly known as src_data) of the IH
vector stores client specific information about an interrupt.
It was expanded from 32 bits to 128 on newer asics. Expand the
src_id field to handle this.

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d766e6a3 29-Mar-2016 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: switch ih handling to two levels (v3)

Newer asics have a two levels of irq ids now:
client id - the IP
src id - the interrupt src within the IP

v2: integrated Christian's comments.
v3: fix rebase fail in SI and CIK

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Ken Wang <Qingqing.Wang@amd.com>
Reviewed-by: Ken Wang <Qingqing.Wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# 0cf3be21 28-Jul-2015 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: Implement irq interfaces for CGS

This implements the irq src registrar.

Reviewed-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>


# d38ceaf9 20-Apr-2015 Alex Deucher <alexander.deucher@amd.com>

drm/amdgpu: add core driver (v4)

This adds the non-asic specific core driver code.

v2: remove extra kconfig option
v3: implement minor fixes from Fengguang Wu
v4: fix cast in amdgpu_ucode.c

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Jammy Zhou <Jammy.Zhou@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>