Cross Reference: /linux-master/drivers/gpu/drm/xe/xe_gt_tlb

History log of /linux-master/drivers/gpu/drm/xe/xe_gt_tlb_invalidation.c
Revision	Date	Author	Comments
# fd1c8085	27-Mar-2024	Thomas Hellström <thomas.hellstrom@linux.intel.com>	drm/xe: Make TLB invalidation fences unordered They can actually complete out-of-order, so allocate a unique fence context for each fence. Fixes: 5387e865d90e ("drm/xe: Add TLB invalidation fence after rebinds issued from execs") Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240327091136.3271-4-thomas.hellstrom@linux.intel.com (cherry picked from commit 0453f1757501df2e82b66b3183a24bba5a6f8fa3) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
# 247f2ee4	12-Feb-2024	Thomas Hellström <thomas.hellstrom@linux.intel.com>	drm/xe: Fix a missing argument to drm_err_printer The indicated commit below added a device argument to the function, but there was a call in the xe driver that was not properly changed. Fixes: 5e0c04c8c40b ("drm/print: make drm_err_printer() device specific by using drm_err()") Cc: Jani Nikula <jani.nikula@intel.com> Cc: Luca Coelho <luciano.coelho@intel.com> Cc: Maxime Ripard <mripard@kernel.org> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240212103833.138263-1-thomas.hellstrom@linux.intel.com
# e8b9b309	18-Dec-2023	Michal Wajdeczko <michal.wajdeczko@intel.com>	drm/xe: Report TLB timeout using GT oriented functions We track TLB invalidation seqno per GT and we have GT oriented message helpers, so it's better to use GT oriented log functions. Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20231218190629.502-3-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
# ccff0b21	22-Feb-2024	Matthew Brost <matthew.brost@intel.com>	drm/xe: Don't support execlists in xe_gt_tlb_invalidation layer The xe_gt_tlb_invalidation layer implements TLB invalidations for a GuC backend. Simply return if in execlists mode. A follow up may properly implement the xe_gt_tlb_invalidation layer for both GuC and execlists. Fixes: a9351846d945 ("drm/xe: Break of TLB invalidation into its own file") Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240222232021.3911545-4-matthew.brost@intel.com (cherry picked from commit a9e483dda3efa5b9aae5d9eef94d2c3a878d9bea) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
# b67cb798	28-Nov-2023	Michal Wajdeczko <michal.wajdeczko@intel.com>	drm/xe/guc: Include only required GuC ABI headers On i915 we were adding new GuC ABI headers directly to guc_fwif.h file since we were replacing old definitions from that file. On xe driver we could do more and better by including ABI headers only in files that need those definitions. Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/741 Cc: Jani Nikula <jani.nikula@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://lore.kernel.org/r/20231128203203.1147-3-michal.wajdeczko@intel.com Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 3d78923b	15-Nov-2023	Michal Wajdeczko <michal.wajdeczko@intel.com>	drm/xe/guc: Promote guc_to_gt/xe helpers to .h Duplicating these helpers in almost every .c file is a bad idea. Define them as inlines in .h file to allow proper reuse. Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# ebb00b28	31-Oct-2023	Pallavi Mishra <pallavi.mishra@intel.com>	drm/xe: Dump CTB during TLB timeout Print CTB info during TLB invalidation timeout event. Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Pallavi Mishra <pallavi.mishra@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# c73acc1e	12-Sep-2023	Francois Dugast <francois.dugast@intel.com>	drm/xe: Use Xe assert macros instead of XE_WARN_ON macro The XE_WARN_ON macro maps to WARN_ON which is not justified in many cases where only a simple debug check is needed. Replace the use of the XE_WARN_ON macro with the new xe_assert macros which relies on drm_*. This takes a struct drm_device argument, which is one of the main changes in this commit. The other main change is that the condition is reversed, as with XE_WARN_ON a message is displayed if the condition is true, whereas with xe_assert it is if the condition is false. v2: - Rebase - Keep WARN splats in xe_wopcm.c (Matt Roper) v3: - Rebase Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 99fea682	27-Jul-2023	Francois Dugast <francois.dugast@intel.com>	drm/xe: Prefer WARN() over BUG() to avoid crashing the kernel Replace calls to XE_BUG_ON() with calls XE_WARN_ON() which in turn calls WARN() instead of BUG(). BUG() crashes the kernel and should only be used when it is absolutely unavoidable in case of catastrophic and unrecoverable failures, which is not the case here. Signed-off-by: Francois Dugast <francois.dugast@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 7d623575	19-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe: drop xe_device_mem_access_get() from invalidation_vma Lockdep gives the following splat: [ 594.158863] ffff888140da53f0 (&vm->userptr.notifier_lock){++++}-{3:3}, at: vma_userptr_invalidate+0xeb/0x330 [xe] [ 594.158921] but task is already holding lock: [ 594.158926] ffffffff82761940 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}, at: unmap_vmas+0x0/0x1c0 [ 594.158941] which lock already depends on the new lock. [ 594.158947] the existing dependency chain (in reverse order) is: [ 594.158953] -> #5 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}: [ 594.158961] fs_reclaim_acquire+0x68/0xd0 [ 594.158969] __kmem_cache_alloc_node+0x2c/0x1b0 [ 594.158975] kmalloc_node_trace+0x1d/0xb0 [ 594.158983] alloc_worker+0x18/0x50 [ 594.158989] init_rescuer.part.0+0x13/0xa0 [ 594.158995] workqueue_init+0xdf/0x210 [ 594.159001] kernel_init_freeable+0x5c/0x2f0 [ 594.159009] kernel_init+0x11/0x1a0 [ 594.159017] ret_from_fork+0x29/0x50 [ 594.159023] -> #4 (fs_reclaim){+.+.}-{0:0}: [ 594.159031] fs_reclaim_acquire+0xa0/0xd0 [ 594.159037] __kmem_cache_alloc_node+0x2c/0x1b0 [ 594.159042] kmalloc_trace+0x20/0xb0 [ 594.159048] acpi_device_add+0x25a/0x3f0 [ 594.159056] acpi_add_single_object+0x387/0x750 [ 594.159063] acpi_bus_check_add+0x108/0x280 [ 594.159069] acpi_bus_scan+0x34/0xf0 [ 594.159075] acpi_scan_init+0xed/0x2b0 [ 594.159082] acpi_init+0x21e/0x520 [ 594.159087] do_one_initcall+0x53/0x260 [ 594.159092] kernel_init_freeable+0x18a/0x2f0 [ 594.159099] kernel_init+0x11/0x1a0 [ 594.159105] ret_from_fork+0x29/0x50 [ 594.159110] -> #3 (acpi_device_lock){+.+.}-{3:3}: [ 594.159117] __mutex_lock+0x95/0xd10 [ 594.159122] acpi_enable_wakeup_device_power+0x30/0x120 [ 594.159130] __acpi_device_wakeup_enable+0x34/0x110 [ 594.159138] acpi_pm_set_device_wakeup+0x55/0x140 [ 594.159143] __pci_enable_wake+0x56/0xb0 [ 594.159150] pci_finish_runtime_suspend+0x35/0x80 [ 594.159157] pci_pm_runtime_suspend+0xb5/0x1a0 [ 594.159162] __rpm_callback+0x3c/0x110 [ 594.159170] rpm_callback+0x58/0x70 [ 594.159176] rpm_suspend+0x15c/0x6f0 [ 594.159182] pm_runtime_work+0x9b/0xb0 [ 594.159188] process_one_work+0x263/0x520 [ 594.159195] worker_thread+0x4d/0x3b0 [ 594.159200] kthread+0xeb/0x120 [ 594.159206] ret_from_fork+0x29/0x50 [ 594.159211] -> #2 (acpi_wakeup_lock){+.+.}-{3:3}: [ 594.159218] __mutex_lock+0x95/0xd10 [ 594.159223] acpi_pm_set_device_wakeup+0x7a/0x140 [ 594.159228] __pci_enable_wake+0x77/0xb0 [ 594.159234] pci_pm_runtime_resume+0x70/0xd0 [ 594.159240] __rpm_callback+0x3c/0x110 [ 594.159246] rpm_callback+0x58/0x70 [ 594.159252] rpm_resume+0x50d/0x7a0 [ 594.159258] rpm_resume+0x267/0x7a0 [ 594.159264] __pm_runtime_resume+0x45/0x90 [ 594.159270] xe_pm_runtime_resume_and_get+0x12/0x50 [xe] [ 594.159314] xe_device_mem_access_get+0x97/0xc0 [xe] [ 594.159346] hw_engines+0x65/0xf0 [xe] [ 594.159380] seq_read_iter+0x10d/0x4b0 [ 594.159385] seq_read+0x9e/0xd0 [ 594.159390] full_proxy_read+0x4e/0x80 [ 594.159396] vfs_read+0xb6/0x310 [ 594.159401] ksys_read+0x60/0xe0 [ 594.159406] do_syscall_64+0x38/0x90 [ 594.159413] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 594.159419] -> #1 (&xe->mem_access.lock){+.+.}-{3:3}: [ 594.159427] xe_device_mem_access_get+0x43/0xc0 [xe] [ 594.159457] xe_gt_tlb_invalidation_vma+0x53/0x190 [xe] [ 594.159490] invalidation_fence_init+0x1d2/0x2c0 [xe] [ 594.159529] __xe_pt_unbind_vma+0x151/0x4e0 [xe] [ 594.159564] vm_bind_ioctl+0x48a/0xae0 [xe] [ 594.159602] async_op_work_func+0x20c/0x530 [xe] [ 594.159634] process_one_work+0x263/0x520 [ 594.159640] worker_thread+0x4d/0x3b0 [ 594.159646] kthread+0xeb/0x120 [ 594.159650] ret_from_fork+0x29/0x50 [ 594.159655] -> #0 (&vm->userptr.notifier_lock){++++}-{3:3}: [ 594.159663] __lock_acquire+0x16fa/0x2850 [ 594.159670] lock_acquire+0xd2/0x2e0 [ 594.159676] down_write+0x36/0xd0 [ 594.159681] vma_userptr_invalidate+0xeb/0x330 [xe] [ 594.159714] __mmu_notifier_invalidate_range_start+0x239/0x2a0 [ 594.159722] unmap_vmas+0x1ac/0x1c0 [ 594.159727] unmap_region+0xb5/0x120 [ 594.159732] do_vmi_align_munmap+0x2be/0x430 [ 594.159739] do_vmi_munmap+0xea/0x120 [ 594.159744] __vm_munmap+0x9c/0x160 [ 594.159750] __x64_sys_munmap+0x12/0x20 [ 594.159756] do_syscall_64+0x38/0x90 [ 594.159761] entry_SYSCALL_64_after_hwframe+0x72/0xdc [ 594.159768] other info that might help us debug this: [ 594.159773] Chain exists of: &vm->userptr.notifier_lock --> fs_reclaim --> mmu_notifier_invalidate_range_start [ 594.159785] Possible unsafe locking scenario: [ 594.159790] CPU0 CPU1 [ 594.159794] ---- ---- [ 594.159797] lock(mmu_notifier_invalidate_range_start); [ 594.159802] lock(fs_reclaim); [ 594.159808] lock(mmu_notifier_invalidate_range_start); [ 594.159814] lock(&vm->userptr.notifier_lock); [ 594.159819] The VM should be holding a mem_access.ref so this looks like it should be a false positive and we can just drop the explicit mem_access in xe_gt_tlb_invalidation(). The GGTT invalidation path also takes care to hold mem_access.ref so should be fine there also, and we already assert that we hold access.ref for the GuC communication underneath. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 35c8a964	10-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe: handle TLB invalidations from CT fast-path In various test cases that put the system under a heavy load, we can sometimes see errors with missed TLB invalidations. In such cases we see the interrupt arrive for the invalidation from the GuC, however the actual processing of the completion is pushed onto a workqueue and handled with all the other CT stuff, which might take longer than expected. Since we expect TLB invalidations to complete within a reasonable amount of time (at most ~250ms), and they do seem pretty critical, allow handling directly from the CT fast-path. v2 (José): - Actually use the correct spinlock/unlock_irq, since pending_lock is grabbed from IRQ. v3: - Don't publish the TLB fence on the list until after we fully initialize it and successfully do the CT send. The list is now only protected by the spin_lock pending_lock and we can't hold that across the entire TLB send operation. v4 (Matt Brost): - Be careful with racing against fast CT path writing the seqno, before we have actually published the fence. References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/297 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/320 References: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/449 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 4aa5e359	10-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe/tlb: print seqno_recv on fence TLB timeout To help debugging, sample the current seqno_recv and dump it out if we encounter a TLB timeout for the fences path. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 2ca01fe3	10-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe/tlb: also update seqno_recv during reset We might have various kworkers waiting for TLB flushes to complete which are not tracked with an explicit TLB fence, however at this stage that will never happen since the CT is already disabled, so make sure we signal them here under the assumption that we have completed a full GT reset. v2: - We need to use seqno - 1 here. After acquiring ct->lock the seqno is actually the next users seqno and not the pending one. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 4803f6e2	10-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe/tlb: increment next seqno after successful CT send If we are in the middle of a GT reset or similar the CT might be disabled, such that the CT send fails. However we already incremented gt->tlb_invalidation.seqno which might lead to warnings, since we effectively just skipped a seqno: 0000:00:02.0: drm_WARN_ON(expected_seqno != msg[0]) Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 86ed0925	10-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe/tlb: ensure we access seqno_recv once Ensure we load gt->tlb_invalidation.seqno_recv once, and use that for our seqno checking. The gt->tlb_invalidation_seqno_past is a shared global variable and can potentially change at any point here. However the checks here need to operate on a stable version of seqno_recv for this to make any sense. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 38fa29dc	10-Jul-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe/tlb: drop unnecessary smp_wmb() wake_up_all() and wait_event_timeout() already have the correct barriers as per https://www.kernel.org/doc/Documentation/memory-barriers.txt. This should ensure that the seqno_recv write can't be re-ordered wrt to the actual wake_up_all() i.e we get woken up but there is no write. The reader side with wait_event_timeout() also has the correct barriers. With that drop the hand rolled smp_wmb(), which is anyway missing some kind of matching barrier on the reader side. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 21ed3327	22-Jun-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Add helpers to hide struct xe_vma internals This will help with the GPUVA port as the internals of struct xe_vma will change. v2: Update comment around helpers Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.kernel.org> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# f79ee301	01-Jun-2023	Matt Roper <matthew.d.roper@intel.com>	drm/xe: Add backpointer from gt to tile Rather than a backpointer to the xe_device, a GT should have a backpointer to its tile (which can then be used to lookup the device if necessary). The gt_to_xe() helper macro (which moves from xe_gt.h to xe_gt_types.h) can and should still be used to jump directly from an xe_gt to xe_device. v2: - Fix kunit test build - Move a couple changes to the previous patch. (Lucas) Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://lore.kernel.org/r/20230601215244.678611-4-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 3af43650	24-May-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe: keep pulling mem_access_get further back Lockdep is unhappy about ggtt->lock -> runtime_pm, where it seems to think this can somehow get inverted. The ggtt->lock looks like a potentially sensitive driver lock, so likely a sensible move to never call the runtime_pm routines while holding it. Actually it looks like d3cold wants to grab this, so perhaps this can indeed deadlock. v2: - Don't forget about xe_gt_tlb_invalidation_vma(), which now needs explicit access_get. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 565ce72e	24-May-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe: don't allocate under ct->lock Seems to be a sensitive lock, where ct->lock looks to be primed with fs_reclaim, so holding that and then allocating memory will cause lockdep to complain. We need to change the ordering wrt to grabbing the ct->lock and potentially grabbing the runtime_pm, since some of the runtime_pm routines can allocate memory (or at least that's what lockdep seems to suggest). Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# a2db3192	05-May-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe: fix tlb_invalidation_seqno_past() Checking seqno_recv >= seqno looks like it will incorrectly report true when the seqno has wrapped (not unlikely given TLB_INVALIDATION_SEQNO_MAX). Calling xe_gt_tlb_invalidation_wait() might then return before the flush has been completed by the GuC. Fix this by treating a large negative delta as an indication that the seqno has wrapped around. Similar to how we treat a large positive delta as an indication that the seqno_recv must have wrapped around, but in that case the seqno has likely also signalled. It looks like we could also potentially make the seqno use the full 32bits as supported by the GuC. Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# a5cecbac	05-May-2023	Nirmoy Das <nirmoy.das@intel.com>	drm/xe: Print GT info on TLB inv failure Print GT info on TLB inv failure for better debugbility. Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# fa4fe0db	18-Apr-2023	Matthew Auld <matthew.auld@intel.com>	drm/xe/tlb: fix expected_seqno calculation It looks like when tlb_invalidation.seqno overflows TLB_INVALIDATION_SEQNO_MAX, we start counting again from one, as per send_tlb_invalidation(). This is also inline with initial value we give it in xe_gt_tlb_invalidation_init(). When calculating the expected_seqno we should also take this into account. While we are here also print out the values if we ever trigger the warning. v2 (José): - drm_WARN_ON() is preferred over plain WARN_ON(), since it gives information on the originating device. Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/248 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# ea9f879d	24-Feb-2023	Lucas De Marchi <lucas.demarchi@intel.com>	drm/xe: Sort includes Sort includes and split them in blocks: 1) .h corresponding to the .c. Example: xe_bb.c should have a "#include "xe_bb.h" first. 2) #include <linux/...> 3) #include <drm/...> 4) local includes 5) i915 includes This is accomplished by running `clang-format --style=file -i --sort-includes drivers/gpu/drm/xe/*.[ch]` and ignoring all the changes after the includes. There are also some manual tweaks to split the blocks. v2: Also sort includes in headers Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 9f9f09d4	26-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Signal invalidation fence immediately if CT send fails This means we are in the middle of a GT reset and no need to do TLB invalidation so just signal invalidation fence immediately. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
# da3799c9	30-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Use GuC to do GGTT invalidations for the GuC firmware Only the GuC should be issuing TLB invalidations if it is enabled. Part of this patch is sanitize the device on driver unload to ensure we do not send GuC based TLB invalidations during driver unload. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
# 332dd0116	24-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Add range based TLB invalidations If the platform supports range based TLB invalidations use them. Hide these details in the xe_gt_tlb_invalidation layer. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
# 38224c00	24-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Add TDR for invalidation fence timeout cleanup Endless fences are not good, add a TDR to cleanup any invalidation fences which have not received an invalidation message within a timeout period. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
# 24b52db6	19-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Add TLB invalidation fence ftrace This will help debug issues with TLB invalidation fences. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# c6b0948f	20-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Kernel doc GT TLB invalidations Document all exported functions. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# f4a8add9	18-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Invalidate TLB after unbind is complete This gets tricky as we can't do the TLB invalidation until the unbind operation is done on the hardware and we can't signal the unbind as complete until the TLB invalidation is done. To work around this we create an unbind fence which does a TLB invalidation after unbind is done on the hardware, signals on TLB invalidation completion, and this fence is installed in the BO dma-resv slot and installed in out-syncs for the unbind operation. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Suggested-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# fc108a8b	17-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Add TLB invalidation fence Fence will be signaled when TLB invalidation completion. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Suggested-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# 62ad0621	17-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Move TLB invalidation variable to own sub-structure in GT TLB invalidations no longer just restricted to USM, move the variables to own sub-structure. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
# a9351846	17-Jan-2023	Matthew Brost <matthew.brost@intel.com>	drm/xe: Break of TLB invalidation into its own file TLB invalidation is used by more than USM (page faults) so break this code out into its own file. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>