#
b8f67b9d |
|
18-Jan-2024 |
Shashank Sharma <shashank.sharma@amd.com> |
drm/amdgpu: change vm->task_info handling This patch changes the handling and lifecycle of vm->task_info object. The major changes are: - vm->task_info is a dynamically allocated ptr now, and its uasge is reference counted. - introducing two new helper funcs for task_info lifecycle management - amdgpu_vm_get_task_info: reference counts up task_info before returning this info - amdgpu_vm_put_task_info: reference counts down task_info - last put to task_info() frees task_info from the vm. This patch also does logistical changes required for existing usage of vm->task_info. V2: Do not block all the prints when task_info not found (Felix) V3: Fixed review comments from Felix - Fix wrong indentation - No debug message for -ENOMEM - Add NULL check for task_info - Do not duplicate the debug messages (ti vs no ti) - Get first reference of task_info in vm_init(), put last in vm_fini() V4: Fixed review comments from Felix - fix double reference increment in create_task_info - change amdgpu_vm_get_task_info_pasid - additional changes in amdgpu_gem.c while porting Cc: Christian Koenig <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Shashank Sharma <shashank.sharma@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
d6e924ad |
|
09-Aug-2022 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Update SMI events for GFX9.4.3 On GFX 9.4.3, there can be multiple KFD nodes. As a result, SMI events for SVM, queue evict/restore should be raised for each node independently. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
74c5b85d |
|
09-May-2022 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Add spatial partitioning support in KFD This patch introduces multi-partition support in KFD. This patch includes: - Support for maximum 8 spatial partitions in KFD. - Initialize one HIQ per partition. - Management of VMID range depending on partition mode. - Management of doorbell aperture space between all partitions. - Each partition does its own queue management, interrupt handling, SMI event reporting. - IOMMU, if enabled with multiple partitions, will only work on first partition. - SPM is only supported on the first partition. - Currently, there is no support for resetting individual partitions. All partitions will reset together. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
8dc1db31 |
|
14-Sep-2022 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Introduce kfd_node struct (v5) Introduce a new structure, kfd_node, which will now represent a compute node. kfd_node is carved out of kfd_dev structure. kfd_dev struct now will become the parent of kfd_node, and will store common resources such as doorbells, GTT sub-alloctor etc. kfd_node struct will store all resources specific to a compute node, such as device queue manager, interrupt handling etc. This is the first step in adding compute partition support in KFD. v2: introduce kfd_node struct to gc v11 (Hawking) v3: make reference to kfd_dev struct through kfd_node (Morris) v4: use kfd_node instead for kfd isr/mqd functions (Morris) v5: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
414e9f52 |
|
08-Apr-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Asynchronously free smi_client The synchronize_rcu may take several ms, which noticeably slows down applications close SMI event handle. Use call_rcu to free client->fifo and client asynchronously and eliminate the synchronize_rcu call in the user thread. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
46ae2af9 |
|
20-Jan-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Add unmap from GPU SMI event SVM range unmapped from GPUs when range is unmapped from CPU, or with xnack on from MMU notifier when range is evicted or migrated. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
c7f21978 |
|
13-Jan-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Add user queue eviction restore SMI event Output user queue eviction and restore event. User queue eviction may be triggered by svm or userptr MMU notifier, TTM eviction, device suspend and CRIU checkpoint and restore. User queue restore may be rescheduled if eviction happens again while restore. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
acac270d |
|
13-Jan-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Add migration SMI event For migration start and end event, output timestamp when migration starts, ends, svm range address and size, GPU id of migration source and destination and svm range attributes, Migration trigger could be prefetch, CPU or GPU page fault and TTM eviction. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e0f1e65b |
|
13-Jan-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Add GPU recoverable fault SMI event Use ktime_get_boottime_ns() as timestamp to correlate with other APIs. Output timestamp when GPU recoverable fault starts and ends to recover the fault, if migration happened or only GPU page table is updated to recover, fault address, if read or write fault. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
163a5a58 |
|
13-Jan-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Enable per process SMI event Process receive event from same process by default. Add a flag to be able to receive event from all processes, this requires super user permission. Event using pid 0 to send the event to all processes, to keep the default behavior of existing SMI events. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e4542269 |
|
31-Mar-2022 |
Lee Jones <lee.jones@linaro.org> |
drm/amdkfd: Create file descriptor after client is added to smi_clients list This ensures userspace cannot prematurely clean-up the client before it is fully initialised which has been proven to cause issues in the past. Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
e79a2398 |
|
31-Mar-2022 |
Lee Jones <lee.jones@linaro.org> |
drm/amdkfd: Create file descriptor after client is added to smi_clients list This ensures userspace cannot prematurely clean-up the client before it is fully initialised which has been proven to cause issues in the past. Cc: Felix Kuehling <Felix.Kuehling@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
|
#
53b97af4 |
|
04-Mar-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Add format attribute to kfd_smi_event_add To enable compiler type-checked against the format string in callers. All warnings (new ones prefixed by >>): >> warning: function 'kfd_smi_event_add' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format] Fixes: d58b8a99cbb8 ("drm/amdkfd: Add SMI add event helper") Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
d58b8a99 |
|
24-Feb-2022 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Add SMI add event helper To remove duplicate code, unify event message format and simplify new event add in the following patches. Use KFD_SMI_EVENT_MSG_SIZE to define msg size, the same size will be used in user space to alloc the msg receive buffer. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
38abd56b |
|
16-Dec-2021 |
Philip Yang <Philip.Yang@amd.com> |
drm/amdkfd: Correct SMI event read size sizeof(buf) is 8 bytes because it is defined as unsigned char *buf, each SMI event read only copy max 8 bytes to user buffer. Correct this by using the buf allocate size. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
d87f36a0 |
|
10-Feb-2022 |
Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> |
drm/amdkfd: update SPDX license header Update the SPDX License header for all the KFD files. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
bc143d8b |
|
21-Nov-2021 |
Evan Quan <evan.quan@amd.com> |
drm/amd/pm: do not expose implementation details to other blocks out of power Those implementation details(whether swsmu supported, some ppt_funcs supported, accessing internal statistics ...)should be kept internally. It's not a good practice and even error prone to expose implementation details. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
56c5977e |
|
19-Oct-2021 |
Graham Sider <Graham.Sider@amd.com> |
drm/amdkfd: replace/remove remaining kgd_dev references Remove get_amdgpu_device and other remaining kgd_dev references aside from declaration/kfd struct entry and initialization. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
410e302e |
|
06-Jul-2021 |
Graham Sider <Graham.Sider@amd.com> |
drm/amdkfd: Update SMI throttle event bitmask Update Arcturus/Aldebaran thermal throttle SMI event path to use ASIC-independent throttler bits when logging. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
ccc43430 |
|
31-Mar-2021 |
Bernard Zhao <bernard@vivo.com> |
drm/amd: use kmalloc_array over kmalloc with multiply Fix patch check warning: WARNING: Prefer kmalloc_array over kmalloc with multiply + buf = kmalloc(MAX_KFIFO_SIZE * sizeof(*buf), GFP_KERNEL); Signed-off-by: Bernard Zhao <bernard@vivo.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
55977744 |
|
28-Aug-2020 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Add GPU reset SMI event Add support for reporting GPU reset events through SMI. KFD would report both pre and post GPU reset events. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
522ec6e0 |
|
30-Jul-2020 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Replace bitmask with event idx in SMI event msg Event bitmask is a 64-bit mask with only 1 bit set. Sending this event bitmask in KFD SMI event message is both wasteful of memory and potentially limiting to only 64 events. Instead send event index in SMI event message. Please note this change does not break the ABI for the two event types defined so far. The new index is identical to the mask used before. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
2c2b0d88 |
|
23-Jul-2020 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Add thermal throttling SMI event Add support for reporting thermal throttling events through SMI. Also, add a counter to count the number of throttling interrupts observed and report the count in the SMI event message. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
6e14adea |
|
19-May-2020 |
Aurabindo Pillai <aurabindo.pillai@amd.com> |
drm/amd/amdkfd: Fix large framesize for kfd_smi_ev_read() The buffer allocated is of 1024 bytes. Allocate this from heap instead of stack. Also remove check for stack size since we're allocating from heap Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
938a0650 |
|
13-May-2020 |
Amber Lin <Amber.Lin@amd.com> |
drm/amdkfd: Provide SMI events watch When the compute is malfunctioning or performance drops, the system admin will use SMI (System Management Interface) tool to monitor/diagnostic what went wrong. This patch provides an event watch interface for the user space to register devices and subscribe events they are interested. After registered, the user can use annoymous file descriptor's poll function with wait-time specified and wait for events to happen. Once an event happens, the user can use read() to retrieve information related to the event. VM fault event is done in this patch. v2: - remove UNREGISTER and add event ENABLE/DISABLE - correct kfifo usage - move event message API to kfd_ioctl.h v3: send the event msg in text than in binary v4: support multiple clients v5: move events enablement from ioctl to fd write v6: sparse fix Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|