#
c2d2588c |
|
07-Apr-2022 |
Jonathan Kim <jonathan.kim@amd.com> |
drm/amdkfd: add send exception operation Add a debug operation that allows the debugger to send an exception directly to runtime through a payload address. For memory violations, normal vmfault signals will be applied to notify runtime instead after passing in the saved exception data when a memory violation was raised to the debugger. For runtime exceptions, this will unblock the runtime enable function which will be explained and implemented in a follow up patch. Signed-off-by: Jonathan Kim <jonathan.kim@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
8dc1db31 |
|
14-Sep-2022 |
Mukul Joshi <mukul.joshi@amd.com> |
drm/amdkfd: Introduce kfd_node struct (v5) Introduce a new structure, kfd_node, which will now represent a compute node. kfd_node is carved out of kfd_dev structure. kfd_dev struct now will become the parent of kfd_node, and will store common resources such as doorbells, GTT sub-alloctor etc. kfd_node struct will store all resources specific to a compute node, such as device queue manager, interrupt handling etc. This is the first step in adding compute partition support in KFD. v2: introduce kfd_node struct to gc v11 (Hawking) v3: make reference to kfd_dev struct through kfd_node (Morris) v4: use kfd_node instead for kfd isr/mqd functions (Morris) v5: rebase (Alex) Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Tested-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Morris Zhang <Shiwu.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
03e5b167 |
|
06-Feb-2022 |
Tao Zhou <tao.zhou1@amd.com> |
drm/amdkfd: rename kfd_process_vm_fault to kfd_dqm_evict_pasid As the function is used in more different cases, use a more general name. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
7eb0502a |
|
10-Nov-2021 |
Graham Sider <Graham.Sider@amd.com> |
drm/amdkfd: replace asic_family with asic_type asic_family was a duplicate of asic_type, both of type amd_asic_type. Replace all instances of device_info->asic_family with adev->asic_type and remove asic_family from device_info. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
dff63da9 |
|
19-Oct-2021 |
Graham Sider <Graham.Sider@amd.com> |
drm/amdkfd: replace kgd_dev in gpuvm amdgpu_amdkfd funcs Modified definitions: - amdgpu_amdkfd_gpuvm_acquire_process_vm - amdgpu_amdkfd_gpuvm_release_process_vm - amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu - amdgpu_amdkfd_gpuvm_free_memory_of_gpu - amdgpu_amdkfd_gpuvm_map_memory_to_gpu - amdgpu_amdkfd_gpuvm_unmap_memory_from_gpu - amdgpu_amdkfd_gpuvm_sync_memory - amdgpu_amdkfd_gpuvm_map_gtt_bo_to_kernel - amdgpu_amdkfd_gpuvm_unmap_gtt_bo_from_kernel - amdgpu_amdkfd_gpuvm_get_vm_fault_info - amdgpu_amdkfd_gpuvm_import_dmabuf - amdgpu_amdkfd_get_tile_config Removed: - get_amdgpu_device Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
3356c38d |
|
14-Oct-2021 |
Graham Sider <Graham.Sider@amd.com> |
drm/amdkfd: replace kgd_dev in various kfd2kgd funcs Modified definitions: - program_sh_mem_settings - set_pasid_vmid_mapping - init_interrupts - address_watch_disable - address_watch_execute - wave_control_execute - address_watch_get_offset - get_atc_vmid_pasid_mapping_info - set_scratch_backing_va - set_vm_context_page_table_base - read_vmid_from_vmfault_reg - get_cu_occupancy - program_trap_handler_settings Signed-off-by: Graham Sider <Graham.Sider@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
6d909c5d |
|
22-Jun-2020 |
Oak Zeng <Oak.Zeng@amd.com> |
drm/amdkfd: Add kernel parameter to stop queue eviction on vm fault This is to keep wavefront context for debug purpose Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
86b6624a |
|
21-Oct-2020 |
Sumera Priyadarsini <sylphrenadin@gmail.com> |
drm/amdgpu: Return boolean types instead of integer values Return statements for functions returning bool should use truth and false instead of 1 and 0 respectively. Modify cik_event_interrupt.c to return false instead of 0. Issue found with Coccinelle. Signed-off-by: Sumera Priyadarsini <sylphrenadin@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
c7b6bac9 |
|
15-Sep-2020 |
Fenghua Yu <fenghua.yu@intel.com> |
drm, iommu: Change type of pasid to u32 PASID is defined as a few different types in iommu including "int", "u32", and "unsigned int". To be consistent and to match with uapi definitions, define PASID and its variations (e.g. max PASID) as "u32". "u32" is also shorter and a little more explicit than "unsigned int". No PASID type change in uapi although it defines PASID as __u64 in some places. Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Fenghua Yu <fenghua.yu@intel.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Lu Baolu <baolu.lu@linux.intel.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Joerg Roedel <jroedel@suse.de> Link: https://lkml.kernel.org/r/1600187413-163670-2-git-send-email-fenghua.yu@intel.com
|
#
938a0650 |
|
13-May-2020 |
Amber Lin <Amber.Lin@amd.com> |
drm/amdkfd: Provide SMI events watch When the compute is malfunctioning or performance drops, the system admin will use SMI (System Management Interface) tool to monitor/diagnostic what went wrong. This patch provides an event watch interface for the user space to register devices and subscribe events they are interested. After registered, the user can use annoymous file descriptor's poll function with wait-time specified and wait for events to happen. Once an event happens, the user can use read() to retrieve information related to the event. VM fault event is done in this patch. v2: - remove UNREGISTER and add event ENABLE/DISABLE - correct kfifo usage - move event message API to kfd_ioctl.h v3: send the event msg in text than in binary v4: support multiple clients v5: move events enablement from ioctl to fd write v6: sparse fix Signed-off-by: Amber Lin <Amber.Lin@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
56fc40ab |
|
25-Sep-2019 |
Yong Zhao <Yong.Zhao@amd.com> |
drm/amdkfd: Eliminate get_atc_vmid_pasid_mapping_valid get_atc_vmid_pasid_mapping_valid() is very similar to get_atc_vmid_pasid_mapping_pasid(), so they can be merged into a new function get_atc_vmid_pasid_mapping_info() to reduce register access times. More importantly, getting the PASID and the valid bit atomically with a single read fixes some potential race conditions where the mapping changes between the two reads. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
5b87245f |
|
16-Oct-2018 |
Amber Lin <Amber.Lin@amd.com> |
drm/amdkfd: Simplify kfd2kgd interface After amdkfd module is merged into amdgpu, KFD can call amdgpu directly and no longer needs to use the function pointer. Replace those function pointers with functions if they are not ASIC dependent. Signed-off-by: Amber Lin <Amber.Lin@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
#
58e69886 |
|
11-Jul-2018 |
Lan Xiao <Lan.Xiao@amd.com> |
drm/amdkfd: fix zero reading of VMID and PASID for Hawaii Upon VM Fault, the VMID and PASID written by HW are zeros in Hawaii. Instead of reading from ih_ring_entry, read directly from the registers. This workaround fix the soft hang issues caused by mishandled VM Fault in Hawaii. Signed-off-by: Lan Xiao <Lan.Xiao@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
#
2640c3fa |
|
11-Jul-2018 |
shaoyunl <Shaoyun.Liu@amd.com> |
drm/amdkfd: Handle VM faults in KFD 1. Pre-GFX9 the amdgpu ISR saves the vm-fault status and address per per-vmid. amdkfd needs to get the information from amdgpu through the new get_vm_fault_info interface. On GFX9 and later, all the required information is in the IH ring 2. amdkfd unmaps all queues from the faulting process and create new run-list without the guilty process 3. amdkfd notifies the runtime of the vm fault trap via EVENT_TYPE_MEMORY Signed-off-by: shaoyun liu <shaoyun.liu@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
#
c129db12 |
|
01-May-2018 |
Felix Kuehling <Felix.Kuehling@amd.com> |
drm/amdkfd: Add sanity checks in IRQ handlers Only accept interrupts from KFD VMIDs. Just checking for a PASID may not be enough because amdgpu started using PASIDs to map VM faults to processes. Warn if an IRQ doesn't have a valid PASID (indicating a firmware bug). Suggested-by: Shaoyun Liu <Shaoyun.Liu@amd.com> Suggested-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
#
3f04f961 |
|
27-Oct-2017 |
Felix Kuehling <Felix.Kuehling@amd.com> |
drm/amdkfd: Use IH context ID for signal lookup This speeds up signal lookup when the IH ring entry includes a valid context ID or partial context ID. Only if the context ID is found to be invalid, fall back to an exhaustive search of all signaled events. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
#
66b783b4 |
|
27-Oct-2017 |
Besar Wicaksono <besar.wicaksono@amd.com> |
drm/amdkfd: Add SDMA trap src id to the KFD isr wanted list This enables SDMA signalling with event interrupt. Signed-off-by: Besar Wicaksono <Besar.Wicaksono@amd.com> Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
#
930c5ff4 |
|
25-Nov-2014 |
Alexey Skidanov <alexey.skidanov@gmail.com> |
drm/amdkfd: Add bad opcode exception handling Signed-off-by: Alexey Skidanov <alexey.skidanov@gmail.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|
#
f3a39818 |
|
09-May-2015 |
Andrew Lewycky <Andrew.Lewycky@amd.com> |
drm/amdkfd: Add the events module This patch adds the events module (kfd_events.c) and the interrupt handle module for Kaveri (cik_event_interrupt.c). The patch updates the interrupt_is_wanted(), so that it now calls the interrupt isr function specific for the device that received the interrupt. That function(implemented in cik_event_interrupt.c) returns whether this interrupt is of interest to us or not. The patch also updates the interrupt_wq(), so that it now calls the device's specific wq function, which checks the interrupt source and tries to signal relevant events. v2: Increase limit of signal events to 4096 per process Remove bitfields from struct cik_ih_ring_entry Rename radeon_kfd_event_mmap to kfd_event_mmap Add debug prints to allocate_free_slot and allocate_signal_page Make allocate_event_notification_slot return a correct value Add warning prints to create_signal_event Remove error print from IOCTL path Reformatted debug prints in kfd_event_mmap Map correct size (as received from mmap) in kfd_event_mmap v3: Reduce limit of signal events back to 256 per process Fix allocation of kernel memory for signal events Signed-off-by: Andrew Lewycky <Andrew.Lewycky@amd.com> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
|