#
d50892a9 |
|
16-Jan-2024 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: switch from drm_debug_printer() to device specific drm_dbg_printer() Prefer the device specific debug printer. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Luca Coelho <luciano.coelho@intel.com> Acked-by: Maxime Ripard <mripard@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/f2614dfcba295be20c650cdab24c3979d265f422.1705410327.git.jani.nikula@intel.com
|
#
cf9cb028 |
|
30-Nov-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Use internal class when counting engine resets Commit 503579448db9 ("drm/i915/gsc: Mark internal GSC engine with reserved uabi class") made the GSC0 engine not have a valid uabi class and so broke the engine reset counting, which in turn was made class based in cb823ed9915b ("drm/i915/gt: Use intel_gt as the primary object for handling resets"). Despite the title and commit text of the latter is not mentioning it (and has left the storage array incorrectly sized), tracking by class, despite it adding aliasing in hypthotetical multi-tile systems, is handy for virtual engines which for instance do not have a valid engine->id. Therefore we keep that but just change it to use the internal class which is always valid. We also add a helper to increment the count, which aligns with the existing getter. What was broken without this fix were out of bounds reads every time a reset would happen on the GSC0 engine, or during selftests when storing and cross-checking the counts in igt_live_test_begin and igt_live_test_end. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: dfed6b58d54f ("drm/i915/gsc: Mark internal GSC engine with reserved uabi class") [tursulin: fixed Fixes tag] Reported-by: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231201122109.729006-2-tvrtko.ursulin@linux.intel.com
|
#
1f721a93 |
|
30-Nov-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Use internal class when counting engine resets Commit 503579448db9 ("drm/i915/gsc: Mark internal GSC engine with reserved uabi class") made the GSC0 engine not have a valid uabi class and so broke the engine reset counting, which in turn was made class based in cb823ed9915b ("drm/i915/gt: Use intel_gt as the primary object for handling resets"). Despite the title and commit text of the latter is not mentioning it (and has left the storage array incorrectly sized), tracking by class, despite it adding aliasing in hypthotetical multi-tile systems, is handy for virtual engines which for instance do not have a valid engine->id. Therefore we keep that but just change it to use the internal class which is always valid. We also add a helper to increment the count, which aligns with the existing getter. What was broken without this fix were out of bounds reads every time a reset would happen on the GSC0 engine, or during selftests when storing and cross-checking the counts in igt_live_test_begin and igt_live_test_end. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 503579448db9 ("drm/i915/gsc: Mark internal GSC engine with reserved uabi class") [tursulin: fixed Fixes tag] Reported-by: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231201122109.729006-2-tvrtko.ursulin@linux.intel.com (cherry picked from commit cf9cb028ac56696ff879af1154c4b2f0b12701fd) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
e96aef07 |
|
09-Oct-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/gt: More use of GT specific print helpers A bunch of print messages got missed in the update to using sub-system specific helpers. So update those. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231009183802.673882-2-John.C.Harrison@Intel.com
|
#
1e975e59 |
|
26-Sep-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Do not disable preemption for resets Commit ade8a0f59844 ("drm/i915: Make all GPU resets atomic") added a preempt disable section over the hardware reset callback to prepare the driver for being able to reset from atomic contexts. In retrospect I can see that the work item at a time was about removing the struct mutex from the reset path. Code base also briefly entertained the idea of doing the reset under stop_machine in order to serialize userspace mmap and temporary glitch in the fence registers (see eb8d0f5af4ec ("drm/i915: Remove GPU reset dependence on struct_mutex"), but that never materialized and was soon removed in 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") and replaced with a SRCU based solution. As such, as far as I can see, today we still have a requirement that resets must not sleep (invoked from submission tasklets), but no need to support invoking them from a truly atomic context. Given that the preemption section is problematic on RT kernels, since the uncore lock becomes a sleeping lock and so is invalid in such section, lets try and remove it. Potential downside is that our short waits on GPU to complete the reset may get extended if CPU scheduling interferes, but in practice that probably isn't a deal breaker. In terms of mechanics, since the preemption disabled block is being removed we just need to replace a few of the wait_for_atomic macros into busy looping versions which will work (and not complain) when called from non-atomic sections. v2: * Fix timeouts which are now in us. (Andi) * Update one comment as a drive by. (Andi) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris.p.wilson@intel.com> Cc: Paul Gortmaker <paul.gortmaker@windriver.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230926100855.61722-1-tvrtko.ursulin@linux.intel.com
|
#
14128d64 |
|
21-Aug-2023 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Replace several IS_METEORLAKE with proper IP version checks Many of the IS_METEORLAKE conditions throughout the driver are supposed to be checks for Xe_LPG and/or Xe_LPM+ IP, not for the MTL platform specifically. Update those checks to ensure that the code will still operate properly if/when these IP versions show up on future platforms. v2: - Update two more conditions (one for pg_enable, one for MTL HuC compatibility). v3: - Don't change GuC/HuC compatibility check, which sounds like it truly is specific to the MTL platform. (Gustavo) - Drop a non-lineage workaround number for the OA timestamp frequency workaround. (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-20-matthew.d.roper@intel.com
|
#
5a213086 |
|
21-Aug-2023 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Eliminate IS_MTL_GRAPHICS_STEP Several workarounds are guarded by IS_MTL_GRAPHICS_STEP. However none of these workarounds are actually tied to MTL as a platform; they only relate to the Xe_LPG graphics IP, regardless of what platform it appears in. At the moment MTL is the only platform that uses Xe_LPG with IP versions 12.70 and 12.71, but we can't count on this being true in the future. Switch these to use a new IS_GFX_GT_IP_STEP() macro instead that is purely based on IP version. IS_GFX_GT_IP_STEP() is also GT-based rather than device-based, which will help prevent mistakes where we accidentally try to apply Xe_LPG graphics workarounds to the Xe_LPM+ media GT and vice-versa. v2: - Switch to a more generic and shorter IS_GT_IP_STEP macro that can be used for both graphics and media IP (and any other kind of GTs that show up in the future). v3: - Switch back to long-form IS_GFX_GT_IP_STEP macro. (Jani) - Move macro to intel_gt.h. (Andi) v4: - Build IS_GFX_GT_IP_STEP on top of IS_GFX_GT_IP_RANGE and IS_GRAPHICS_STEP building blocks and name the parameters from/until rather than begin/fixed. (Jani) - Fix usage examples in comment. v5: - Tweak comment on macro. (Gustavo) Cc: Gustavo Sousa <gustavo.sousa@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-15-matthew.d.roper@intel.com
|
#
28c46fee |
|
21-Aug-2023 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Consolidate condition for Wa_22011802037 The workaround bounds for Wa_22011802037 are somewhat complex and are replicated in several places throughout the code. Pull the condition out to a helper function to prevent mistakes if this condition needs to change again in the future. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-12-matthew.d.roper@intel.com
|
#
ed6dd32c |
|
05-Jul-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Remove some dead "code" Commit 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") removed the temporary implementation of a reset under stop machine but forgot to remove this one commented out define. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230705095518.3690951-1-tvrtko.ursulin@linux.intel.com
|
#
848a4e5c |
|
08-Jun-2023 |
Luca Coelho <luciano.coelho@intel.com> |
drm/i915: add a dedicated workqueue inside drm_i915_private In order to avoid flush_scheduled_work() usage, add a dedicated workqueue in the drm_i915_private structure. In this way, we don't need to use the system queue anymore. This change is mostly mechanical and based on Tetsuo's original patch[1]. v6 by Jani: - Also create unordered_wq for mock device Link: https://patchwork.freedesktop.org/series/114608/ [1] Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c816ebe17ef08d363981942a096a586a7658a65e.1686231190.git.jani.nikula@intel.com
|
#
476f62b8 |
|
18-Apr-2023 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: use explicit includes for i915_reg.h and i915_irq.h A lot of places include i915_reg.h implicitly via i915_irq.h, which gets included implicitly via intel_display_trace.h. Remove the includes from the headers, and include i915_reg.h and i915_irq.h explicitly where needed. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230419094243.366821-1-jani.nikula@intel.com
|
#
59c6106e |
|
13-Apr-2023 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/display: add intel_display_reset.[ch] Split out the display reset functionality to a separate file to declutter intel_display.c. Rename the functions accordingly. The minor downside is having to expose __intel_display_resume(). Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/5e98e2fc5f0c09490e02d22250c8201342852288.1681465222.git.jani.nikula@intel.com
|
#
b7d70b8b |
|
23-Mar-2023 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/gsc: implement wa 14015076503 The WA states that we need to alert the GSC FW before doing a GSC engine reset and then wait for 200ms. The GuC owns engine reset, so on the i915 side we only need to apply this for full GT reset. Given that we do full GT resets in the resume paths to cleanup the HW state and that a long wait in those scenarios would not be acceptable, a faster path has been introduced where, if the GSC is idle, we try first to individually reset the GuC and all engines except the GSC and only fall back to full reset if that fails. Note: according to the WA specs, if the GSC is idle it should be possible to only wait for the uC wakeup time (~15ms) instead of the whole 200ms. However, the GSC FW team have mentioned that the wakeup time can change based on other things going on in the HW and pcode, so a good security margin would be required. Given that when the GSC is idle we already skip the wait & reset entirely and that this reduced wait would still likely be too long to use in resume paths, it's not worth adding support for this reduced wait. v2: add comment to explain why it is safe to skip the GSC reset (John) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323231857.2194435-2-daniele.ceraolospurio@intel.com
|
#
625af472 |
|
23-Mar-2023 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: limit double GT reset to pre-MTL Commit 3db9d590557d ("drm/i915/gt: Reset twice") modified the code to always hit the GDRST register twice when doing a reset, with the reported aim to fix invalid post-reset engine state on some platforms (Jasperlake being the only one actually mentioned). This is a problem on MTL, due to the fact that we have to apply a time consuming WA (coming in the next patch) every time we hit the GDRST register in a way that can include the GSC engine. Even post MTL, the expectation is that we'll have some work to do before and after hitting the GDRST if the GSC is involved. Since the issue requiring the double reset seems to be limited to older platforms, instead of trying to handle the double-reset on MTL and future platforms it is just easier to turn it off. The default on MTL is also for GuC to own engine reset, with i915 only covering full-GT reset. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323231857.2194435-1-daniele.ceraolospurio@intel.com
|
#
3db9d590 |
|
12-Dec-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Reset twice After applying an engine reset, on some platforms like Jasperlake, we occasionally detect that the engine state is not cleared until shortly after the resume. As we try to resume the engine with volatile internal state, the first request fails with a spurious CS event (it looks like it reports a lite-restore to the hung context, instead of the expected idle->active context switch). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212161338.1007659-1-andi.shyti@linux.intel.com
|
#
4050e6f2 |
|
23-Nov-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/gt: remove some limited use register access wrappers Remove rmw_set(), rmw_clear(), clear_register(), rmw_set_fw(), and rmw_clear_fw(). They're just one too many levels of abstraction for register access, for very specific purposes. clear_register() seems like a micro-optimization bypassing the write when the register is already clear, but that trick has ceased to work since commit 06b975d58fd6 ("drm/i915: make intel_uncore_rmw() write unconditionally"). Just clear the register in the most obvious way. Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221123164916.4128733-1-jani.nikula@intel.com
|
#
d3de5616 |
|
12-Dec-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Reset twice After applying an engine reset, on some platforms like Jasperlake, we occasionally detect that the engine state is not cleared until shortly after the resume. As we try to resume the engine with volatile internal state, the first request fails with a spurious CS event (it looks like it reports a lite-restore to the hung context, instead of the expected idle->active context switch). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: stable@vger.kernel.org Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221212161338.1007659-1-andi.shyti@linux.intel.com (cherry picked from commit 3db9d590557da3aa2c952f2fecd3e9b703dad790) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
178b8a36 |
|
02-Nov-2022 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Don't deadlock busyness stats vs reset The engine busyness stats has a worker function to do things like 64bit extend the 32bit hardware counters. The GuC's reset prepare function flushes out this worker function to ensure no corruption happens during the reset. Unforunately, the worker function has an infinite wait for active resets to finish before doing its work. Thus a deadlock would occur if the worker function had actually started just as the reset starts. The function being used to lock the reset-in-progress mutex is called intel_gt_reset_trylock(). However, as noted it does not follow standard 'trylock' conventions and exit if already locked. So rename the current _trylock function to intel_gt_reset_lock_interruptible(), which is the behaviour it actually provides. In addition, add a new implementation of _trylock and call that from the busyness stats worker instead. v2: Rename existing trylock to interruptible rather than trying to preserve the existing (confusing) naming scheme (review comments from Tvrtko). Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102192109.2492625-3-John.C.Harrison@Intel.com
|
#
d24e7855 |
|
16-Sep-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Bump the reset-failure timeout to 60s If attempting to perform a GT reset takes long than 5 seconds (including resetting the display for gen3/4), then we declare all hope lost and discard all user work and wedge the device to prevent further misbehaviour. 5 seconds is too short a time for such drastic action, as we may be stuck on other timeouts and watchdogs. If we allow a little bit longer before hitting the big red button, we should at the very least capture other hung task indicators pointing towards the reason why the reset was hanging; and allow more marginal cases the extra headroom to complete the reset without further collateral damage. Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/6448 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220916204823.1897089-1-ashutosh.dixit@intel.com
|
#
3bb6a442 |
|
01-Sep-2022 |
Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> |
drm/i915: Rename ggtt_view as gtt_view So far, different views (normal, partial, rotated and remapped) into the same object are only supported for GGTT mappings. But with the upcoming VM_BIND feature, PPGTT will also use the partial view mapping. Hence rename ggtt_view to more generic gtt_view. Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220901183854.3446-1-niranjana.vishwanathapura@intel.com
|
#
1dab4561 |
|
29-Jun-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/reset: Handle reset timeouts under unrelated kernel hangs When resuming after hibernate sometimes we see hangs in unrelated kernel subsystems. These hangs often result in the following i915 trace: i915 0000:00:02.0: [drm] *ERROR* \ intel_gt_reset_global timed out, cancelling all in-flight rendering implying our reset task has been starved by the hanging kernel subsystem, causing us to inappropiately declare the system as wedged beyond recovery. The trace would be caused by our synchronize_srcu_expedited() taking more than the allowed 5s due to the unrelated kernel hang. But we neither need to perform that synchronisation inside the reset watchdog, nor do we need such a short timeout before declaring the device as unrecoverable. v2: Restore watchdog timeout to the previous 5 seconds (Ashutosh) Bug: https://gitlab.freedesktop.org/drm/intel/-/issues/3575 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220630043959.5708-1-ashutosh.dixit@intel.com
|
#
336561a9 |
|
12-Jul-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Serialize GRDOM access between multiple engine resets Don't allow two engines to be reset in parallel, as they would both try to select a reset bit (and send requests to common registers) and wait on that register, at the same time. Serialize control of the reset requests/acks using the uncore->lock, which will also ensure that no other GT state changes at the same time as the actual reset. Cc: stable@vger.kernel.org # v4.4 and upper Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7dbac3011729.1657639152.git.mchehab@kernel.org
|
#
b409db08 |
|
19-May-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
Revert "drm/i915: Drop has_reset_engine from device info" This reverts commit 922abe4d19bd21b38298f3902674774b92a49293. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220519090802.1294691-6-tvrtko.ursulin@linux.intel.com
|
#
303760aa |
|
25-Apr-2022 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
i915/guc/reset: Make __guc_reset_context aware of guilty engines There are 2 ways an engine can get reset in i915 and the method of reset affects how KMD labels a context as guilty/innocent. (1) GuC initiated engine-reset: GuC resets a hung engine and notifies KMD. The context that hung on the engine is marked guilty and all other contexts are innocent. The innocent contexts are resubmitted. (2) GT based reset: When an engine heartbeat fails to tick, KMD initiates a gt/chip reset. All active contexts are marked as guilty and discarded. In order to correctly mark the contexts as guilty/innocent, pass a mask of engines that were reset to __guc_reset_context. Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220426003045.3929439-1-umesh.nerlige.ramappa@intel.com
|
#
922abe4d |
|
05-May-2022 |
José Roberto de Souza <jose.souza@intel.com> |
drm/i915: Drop has_reset_engine from device info No need to have this parameter in intel_device_info struct as all platforms with graphics version 7 or newer can reset engines. As a side effect of the of removal this flag, it will not be printed in dmesg during driver load anymore and developers will have to rely on to check the macro and compare with platform being used and IP versions of it. Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220505193524.276400-3-jose.souza@intel.com
|
#
b24dcf1d |
|
12-Jul-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Serialize GRDOM access between multiple engine resets Don't allow two engines to be reset in parallel, as they would both try to select a reset bit (and send requests to common registers) and wait on that register, at the same time. Serialize control of the reset requests/acks using the uncore->lock, which will also ensure that no other GT state changes at the same time as the actual reset. Cc: stable@vger.kernel.org # v4.4 and upper Reported-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e0a2d894e77aed7c2e36b0d1abdc7dbac3011729.1657639152.git.mchehab@kernel.org (cherry picked from commit 336561a914fc0c6f1218228718f633b31b7af1c3) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
dac38381 |
|
15-Apr-2022 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/guc: Enable Wa_22011802037 for gen12 GuC based platforms Initiating a reset when the command streamer is not idle or in the middle of executing an MI_FORCE_WAKE can result in a hang. Multiple command streamers can be part of a single reset domain, so resetting one would mean resetting all command streamers in that domain. To workaround this, before initiating a reset, ensure that all command streamers within that reset domain are either IDLE or are not executing a MI_FORCE_WAKE. Enable GuC PRE_PARSER WA bit so that GuC follows the WA sequence when initiating engine-resets. For gt-resets, ensure that i915 applies the WA sequence. Opens to address in future patches: - The part of the WA to wait for pending forcewakes is also applicable to execlists backend. - The WA also needs to be applied for gen11 Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220415224025.3693037-3-umesh.nerlige.ramappa@intel.com
|
#
a6f0f9cf |
|
21-Mar-2022 |
Alan Previn <alan.previn.teres.alexis@intel.com> |
drm/i915/guc: Plumb GuC-capture into gpu_coredump Add a flags parameter through all of the coredump creation functions. Add a bitmask flag to indicate if the top level gpu_coredump event is triggered in response to a GuC context reset notification. Using that flag, ensure all coredump functions that read or print mmio-register values related to work submission or command-streamer engines are skipped and replaced with a calls guc-capture module equivalent functions to retrieve or print the register dump. While here, split out display related register reading and printing into its own function that is called agnostic to whether GuC had triggered the reset. For now, introduce an empty printing function that can filled in on a subsequent patch just to handle formatting. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-13-alan.previn.teres.alexis@intel.com
|
#
01fabda8 |
|
25-Feb-2022 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: Use str_yes_no() Remove the local yesno() implementation and adopt the str_yes_no() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220225234631.3725943-1-lucas.demarchi@intel.com
|
#
89e96d82 |
|
25-Apr-2022 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
i915/guc/reset: Make __guc_reset_context aware of guilty engines There are 2 ways an engine can get reset in i915 and the method of reset affects how KMD labels a context as guilty/innocent. (1) GuC initiated engine-reset: GuC resets a hung engine and notifies KMD. The context that hung on the engine is marked guilty and all other contexts are innocent. The innocent contexts are resubmitted. (2) GT based reset: When an engine heartbeat fails to tick, KMD initiates a gt/chip reset. All active contexts are marked as guilty and discarded. In order to correctly mark the contexts as guilty/innocent, pass a mask of engines that were reset to __guc_reset_context. Fixes: eb5e7da736f3 ("drm/i915/guc: Reset implementation for new GuC interface") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220426003045.3929439-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 303760aa914b7f5ac9602dbb4b471a2ad52eeb3e) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
e30e6c7b |
|
14-Feb-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Move MCHBAR registers to their own header Registers that exist within the MCH BAR and are mirrored into the GPU's MMIO space are a good candidate to separate out into their own header. For reference, the mirror of the MCH BAR starts at the following locations in the graphics MMIO space (the end of the MCHBAR range differs slightly on each platform): * Pre-gen6: 0x10000 * Gen6-Gen11 + RKL: 0x140000 v2: - Create separate patch to swtich a few register definitions to be relative to the MCHBAR mirror base. - Drop upper bound of MCHBAR mirror from commit message; there are too many different combinations between various platforms to list out, and the documentation is spotty for the older pre-gen6 platforms anyway. Bspec: 134, 51771 Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220215061342.2055952-2-matthew.d.roper@intel.com
|
#
93cc7aa0 |
|
08-Feb-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Move SFC lock bits to intel_engine_regs.h These SFC registers were defined in an unusual way, taking an engine as a parameter rather than an engine MMIO base offset. Let's adjust them to match the style used by other per-engine registers and move them to intel_engine_regs.h. While doing this move, we can drop GEN12_HCP_SFC_FORCED_LOCK completely; it was intended for use in an early version of a hardware workaround, but was no longer necessary by the time the workaround was finalized. It is not used anywhere in the driver. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220209051140.1599643-3-matthew.d.roper@intel.com
|
#
5472b3f2 |
|
10-Feb-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: split out i915_file_private.h from i915_drv.h Limit the scope of struct drm_i915_file_private to the files that actually need it. Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e375859dc1729a1b988036e4103e5b1bd48caa00.1644507885.git.jani.nikula@intel.com
|
#
154cfae6 |
|
28-Jan-2022 |
Bruce Chang <yu.bruce.chang@intel.com> |
drm/i915/dg2: Add Wa_22011100796 Whenever Full soft reset is required, reset all individual engines first, and then do a full soft reset. Signed-off-by: Bruce Chang <yu.bruce.chang@intel.com> cc: Matt Roper <matthew.d.roper@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220128185209.18077-5-ramalingam.c@intel.com
|
#
0d6419e9 |
|
27-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Move GT registers to their own header file This is a huge, chaotic mass of registers copied over as-is without any real cleanup. We'll come back and organize these better, align on consistent coding style, remove dead code, etc. in separate patches later that will be easier to review. v2: - Add missing include in intel_pxp_irq.c v3: - Correct a few indentation errors (Lucas) - Minor conflict resolution Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
|
#
202b1f4c |
|
10-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Move engine registers to their own header Let's continue breaking up and cleaning up the massive i915_reg.h file by moving all registers that are defined in relation to an engine base to their own header. There are probably a bunch of other "engine registers" that we haven't moved yet (especially those that belong to the render engine in the 0x2??? range), but this is a relatively straightforward first step. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
|
#
7e470f10 |
|
10-Jan-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: split out PCI config space registers from i915_reg.h The PCI config space registers don't really belong next to the MMIO register definitions. v2: Fix copyright year (Matt) Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220110095740.166078-1-jani.nikula@intel.com
|
#
35291c9c |
|
10-Dec-2021 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/reset: include intel_display.h instead of intel_display_types.h Use the more specific include that's needed. v2: Include intel_display.h (Ville) Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/e1e195b8ca1252d1e149c8185d107a5517496973.1639142167.git.jani.nikula@intel.com
|
#
20cddfcc |
|
06-Dec-2021 |
Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> |
drm/i915/gt: Use hw_engine_masks as reset_domains We need a way to reset engines by their reset domains. This change sets up way to fetch reset domains of each engine globally. Changes since V1: - Use static reset domain array - Ville and Tvrtko - Use BUG_ON at appropriate place - Tvrtko Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211206081026.4024401-1-tejaskumarx.surendrakumar.upadhyay@intel.com
|
#
9a7fc952 |
|
11-Nov-2021 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Skip error capture when wedged on init Trying to capture uninitialised engines when we wedged on init ends in tears. Skip that together with uC capture, since failure to initialise the latter can actually be one of the reasons for wedging on init. v2: * Use i915_disable_error_state when wedging on init/fini. v3: * Handle mock tests. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> # v1 Link: https://patchwork.freedesktop.org/patch/msgid/20211111130634.266098-1-tvrtko.ursulin@linux.intel.com
|
#
03f060b7 |
|
28-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/resets: Don't set / test for per-engine reset bits with GuC submission Don't set, test for, or clear per-engine reset bits with GuC submission as the GuC owns the per engine resets not the i915. Setting, testing for, and clearing these bits is causing issues with the hangcheck selftest. Rather than change to test to not use these bits, rip the use of these bits out from the reset code. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211028224224.32693-1-matthew.brost@intel.com
|
#
ae8ac10d |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Implement banned contexts for GuC submission When using GuC submission, if a context gets banned disable scheduling and mark all inflight requests as complete. Cc: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-25-matthew.brost@intel.com
|
#
dc0dad36 |
|
26-Jul-2021 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Fix for error capture after full GPU reset with GuC In the case of a full GPU reset (e.g. because GuC has died or because GuC's hang detection has been disabled), the driver can't rely on GuC reporting the guilty context. Instead, the driver needs to scan all active contexts and find one that is currently executing, as per the execlist mode behaviour. In GuC mode, this scan is different to execlist mode as the active request list is handled very differently. Similarly, the request state dump in debugfs needs to be handled differently when in GuC submission mode. Also refactured some of the request scanning code to avoid duplication across the multiple code paths that are now replicating it. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-20-matthew.brost@intel.com
|
#
eb5e7da7 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Reset implementation for new GuC interface Reset implementation for new GuC interface. This is the legacy reset implementation which is called when the i915 owns the engine hang check. Future patches will offload the engine hang check to GuC but we will continue to maintain this legacy path as a fallback and this code path is also required if the GuC dies. With the new GuC interface it is not possible to reset individual engines - it is only possible to reset the GPU entirely. This patch forces an entire chip reset if any engine hangs. v2: (Michal) - Check for -EPIPE rather than -EIO (CT deadlock/corrupt check) v3: (John H) - Split into a series of smaller patches v4: (John H) - Fix typo - Add braces around if statements in reset code v5: (Checkpatch) - Fix warnings Cc: John Harrison <john.c.harrison@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <john.c.harrison@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-9-matthew.brost@intel.com
|
#
ddabf721 |
|
23-Jul-2021 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/xehp: Extra media engines - Part 3 (reset) Xe_HP can have a lot of extra media engines. This patch adds the reset support for them. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210723174239.1551352-5-matthew.d.roper@intel.com
|
#
c816723b |
|
05-Jun-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
|
#
5b26d57f |
|
26-May-2021 |
Aditya Swarup <aditya.swarup@intel.com> |
drm/i915: Add Wa_14010733141 The WA requires the following procedure for VDBox SFC reset: If (MFX-SFC usage is 1) { 1.Issue a MFX-SFC forced lock 2.Wait for MFX-SFC forced lock ack 3.Check the MFX-SFC usage bit If (MFX-SFC usage bit is 1) Reset VDBOX and SFC else Reset VDBOX Release the force lock MFX-SFC } else if(HCP+SFC usage is 1) { 1.Issue a VE-SFC forced lock 2.Wait for SFC forced lock ack 3.Check the VE-SFC usage bit If (VE-SFC usage bit is 1) Reset VDBOX else Reset VDBOX and SFC Release the force lock VE-SFC. } else Reset VDBOX - Restructure: the changes to the original code flow should stay relatively minimal; we only need to do an extra HCP check after the usual VD-MFX check and, if true, switch the register/bit we're performing the lock on.(MattR) v2: - Assign unlock mask using paired_engine->mask instead of using BIT(paired_vecs->id). (Daniele) Bspec: 52890, 53509 Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Aditya Swarup <aditya.swarup@intel.com> Co-developed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210526094852.286424-2-aditya.swarup@intel.com
|
#
c92c36ed |
|
21-May-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move submission_method into intel_gt Since we setup the submission method for the engines once, it is easy to assign an enum and use that instead of probing into the backends. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-3-matthew.brost@intel.com
|
#
c10e4a79 |
|
01-Feb-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Protect against request freeing during cancellation on wedging As soon as we mark a request as completed, it may be retired. So when cancelling a request and marking it complete, make sure we first keep a reference to the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210201085715.27435-4-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
24f90d66 |
|
22-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: SPDX cleanup Clean up the SPDX licence declarations to comply with checkpatch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-1-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
5b0a78ec |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Move gt_revoke() slightly We get a lockdep splat when the reset mutex is held, because it can be taken from fence_wait. This conflicts with the mmu notifier we have, because we recurse between reset mutex and mmap lock -> mmu notifier. Remove this recursion by calling revoke_mmaps before taking the lock. The reset code still needs fixing, as taking mmap locks during reset is not allowed. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> [danvet: Add FIXME.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-64-maarten.lankhorst@linux.intel.com
|
#
e322551f |
|
28-Jan-2021 |
Thomas Zimmermann <tzimmermann@suse.de> |
drm/i915/gt: Remove references to struct drm_device.pdev Using struct drm_device.pdev is deprecated. Convert i915 to struct drm_device.dev. No functional changes. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-3-tzimmermann@suse.de
|
#
163433e5 |
|
14-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Mark up protected uses of 'i915_request_completed' When we know that we are inside the timeline mutex, or inside the submission flow (under active.lock or the holder's rcu lock), we know that the rq->hwsp is stable and we can use the simpler direct version. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk
|
#
9834dfef |
|
13-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Prune inlines Remove all the manual inlines from non-critical sections in gt/ add/remove: 2/0 grow/shrink: 0/3 up/down: 762/-1473 (-711) Function old new delta mi_set_context.isra - 602 +602 write_dma_entry - 160 +160 __set_pd_entry 214 69 -145 clear_pd_entry 190 42 -148 ring_request_alloc 2021 841 -1180 Total: Before=1605086, After=1604375, chg -0.04% Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210113152224.29794-1-chris@chris-wilson.co.uk
|
#
0a7d355e |
|
04-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Allow failed resets without assertion If the engine reset fails, we will attempt to resume with the current inflight submissions. When that happens, we cannot assert that the engine reset cleared the pending submission, so do not. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2878 Fixes: 16f2941ad307 ("drm/i915/gt: Replace direct submit with direct call to tasklet") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-3-chris@chris-wilson.co.uk
|
#
cecb2af4 |
|
29-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Taint the reset mutex with the shrinker Declare that, under extreme circumstances, the shrinker may need to wait upon a request, in which case reset must not itself deadlock in order to ensure forward progress of the driver. That is since the shrinker may depend upon a reset, any reset cannot touch the shrinker. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201229141626.4773-1-chris@chris-wilson.co.uk
|
#
16f2941a |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Replace direct submit with direct call to tasklet Rather than having special case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths, and most importantly greatly simplifying the control flow by removing reentrancy. This will enable us to close a few races in the virtual engines in the next few patches. The trickiest part here is to ensure that paired operations (such as schedule_in/schedule_out) remain under consistent locking domains, e.g. when pulled outside of the engine->active.lock v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET softirq synchronous"). v3: Update engine-reset to be tasklet aware Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
|
#
cb56a07d |
|
04-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Include reset failures in the trace The GT and engine reset failures are completely invisible when looking at a trace for a bug, but are vital to understanding the incomplete flow. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201204151234.19729-3-chris@chris-wilson.co.uk
|
#
e669ad6f |
|
06-Nov-2020 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/display: add namespace to intel_finish_reset Rename intel_finish_reset to intel_display_finish_reset, so it's clear from gt/ that we are calling out the display code. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-2-lucas.demarchi@intel.com
|
#
87ebfaab |
|
06-Nov-2020 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/display: add namespace to intel_prepare_reset Rename intel_prepare_reset to intel_display_prepare_reset, so it's clear from gt/ that we are calling out the display code. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201106225531.920641-1-lucas.demarchi@intel.com
|
#
bda30024 |
|
04-Nov-2020 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Improve record of hung engines in error state Between events which trigger engine and GPU resets and capturing the error state we lose information on which engine triggered the reset. Improve this by passing in the hung engine mask down to error capture. Result is that the list of engines in user visible "GPU HANG: ecode <gen>:<engines>:<ecode>, <process>" is now a list of hanging and not just active engines. Most importantly the displayed process is now the one which was actually hung. v2: * Stub prototype. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201104134743.916027-1-tvrtko.ursulin@linux.intel.com
|
#
b0573472 |
|
30-Sep-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Retire cancelled requests on unload If we manage to hit the intel_gt_set_wedged_on_fini() while active, i.e. module unload during a stress test, we may cancel the requests but not clean up. This leads to a very slow module unload as we wait for something or other to trigger the retirement flushing, or timeout and unload with a bunch of warnings. Instead if we explicitly cancel then cleanup on an active unload, it should be instant and quiet. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200930163253.2789-3-chris@chris-wilson.co.uk
|
#
b3786b29 |
|
31-Jul-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling, active requests are passed to a physical engine and associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try and enable interrupts and the powermanagement required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for interrupt handler on a particular breadcrumb. v2: Drop b->irq_armed = true mocking for no interrupt HW Fixes: 4fe6abb8f513 ("drm/i915/gt: Ignore irq enabling on the virtual engines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
792592e7 |
|
07-Jul-2020 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: Move the engine mask to intel_gt_info Since the engines belong to the GT, move the runtime-updated list of available engines to the intel_gt struct. The original mask has been renamed to indicate it contains the maximum engine list that can be found on a matching device. In preparation for other info being moved to the gt in follow up patches (sseu), introduce an intel_gt_info structure to group all gt-related runtime info. v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com
|
#
65706203 |
|
06-Jul-2020 |
Michał Winiarski <michal.winiarski@intel.com> |
drm/i915: Print caller when tainting for CI We can add taint from multiple places, printing the caller allows us to have a better overview of what exactly caused us to do the tainting. v2: Tweak format and print the device (Chris) v3: Move things around (Chris) Suggested-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200706144107.204821-2-michal@hardline.pl
|
#
3f04bdce |
|
06-Jul-2020 |
Michał Winiarski <michal.winiarski@intel.com> |
drm/i915: Reboot CI if we get wedged during driver init Getting wedged device on driver init is pretty much unrecoverable. Since we're running various scenarios that may potentially hit this in CI (module reload / selftests / hotunplug), and if it happens, it means that we can't trust any subsequent CI results, we should just apply the taint to let the CI know that it should reboot (CI checks taint between test runs). v2: Comment that WEDGED_ON_INIT is non-recoverable, distinguish WEDGED_ON_INIT from WEDGED_ON_FINI (Chris) v3: Appease checkpatch, fixup search-replace logic expression mindbomb in assert (Chris) Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200706144107.204821-1-michal@hardline.pl
|
#
8a25c4be |
|
18-Jun-2020 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/params: switch to device specific parameters Start using device specific parameters instead of module parameters for most things. The module parameters become the immutable initial values for i915 parameters. The device specific parameters in i915->params start life as a copy of i915_modparams. Any later changes are only reflected in the debugfs. The stragglers are: * i915.force_probe and i915.modeset. Needed before dev_priv is available. This is fine because the parameters are read-only and never modified. * i915.verbose_state_checks. Passing dev_priv to I915_STATE_WARN and I915_STATE_WARN_ON would result in massive and ugly churn. This is handled by not exposing the parameter via debugfs, and leaving the parameter writable in sysfs. This may be fixed up in follow-up work. * i915.inject_probe_failure. Only makes sense in terms of the module, not the device. This is handled by not exposing the parameter via debugfs. v2: Fix uc i915 lookup code (Michał Winiarski) Cc: Juha-Pekka Heikkilä <juha-pekka.heikkila@intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Cc: Michał Winiarski <michal.winiarski@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Michał Winiarski <michal.winiarski@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20200618150402.14022-1-jani.nikula@intel.com
|
#
dc483ba5 |
|
02-Apr-2020 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/gt: prefer struct drm_device based logging Prefer struct drm_device based logging over struct device based logging. No functional changes. Cc: Wambui Karuga <wambui.karugax@gmail.com> Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-16-jani.nikula@intel.com
|
#
a24c57d0 |
|
19-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Cancel a hung context if already closed Use the restored ability to check if a context is closed to decide whether or not to immediately ban the context from further execution after a hang. Fixes: be90e344836a ("drm/i915/gt: Cancel banned contexts after GT reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-2-chris@chris-wilson.co.uk (cherry picked from commit 8e37d699139128139c0468e005c2f0d6215b0c55) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
8e37d699 |
|
19-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Cancel a hung context if already closed Use the restored ability to check if a context is closed to decide whether or not to immediately ban the context from further execution after a hang. Fixes: be90e344836a ("drm/i915/gt: Cancel banned contexts after GT reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200319170707.8262-2-chris@chris-wilson.co.uk
|
#
f899f786 |
|
16-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GGTT fence registers under gt/ Since the fence registers control HW detiling through the GGTT aperture, make them a part of the intel_ggtt under gt/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200316113846.4974-1-chris@chris-wilson.co.uk
|
#
be90e344 |
|
04-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Cancel banned contexts after GT reset As we started marking the ce->gem_context as NULL on closure, we can no longer use that to carry closure information. Instead, we can look at whether the context was killed on closure instead. Fixes: 130a95e9098e ("drm/i915/gem: Consolidate ctx->engines[] release") Closes: https://gitlab.freedesktop.org/drm/intel/issues/1379 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200304165113.2449213-1-chris@chris-wilson.co.uk
|
#
36e191f0 |
|
03-Mar-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Apply i915_request_skip() on submission Trying to use i915_request_skip() prior to i915_request_add() causes us to try and fill the ring upto request->postfix, which has not yet been set, and so may cause us to memset() past the end of the ring. Instead of skipping the request immediately, just flag the error on the request (only accepting the first fatal error we see) and then clear the request upon submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200304121849.2448028-1-chris@chris-wilson.co.uk
|
#
6e17ae73 |
|
07-Feb-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Only ignore already reset requests If a request is being re-run after an innocent reset, it is marked as -EAGAIN. So only skip an engine reset if the request is marked as -EIO. Testcase: igt/gem_ctx_exec/basic-nohangcheck Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200207161602.2838218-1-chris@chris-wilson.co.uk
|
#
faea1792 |
|
31-Jan-2020 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: extract engine WA programming to common resume function The workarounds are a common "feature" across gens and submission mechanisms and we already call the other WA related functions from common engine ones (<setup/cleanup>_common), so it makes sense to do the same with WA application. Medium-term, This will help us reduce the duplication once the GuC resume function is added, but short term it will also allow us to use the workaround lists for pre-gen8 engine workarounds. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200131075716.2212299-2-chris@chris-wilson.co.uk
|
#
f8474622 |
|
28-Jan-2020 |
Wambui Karuga <wambui.karugax@gmail.com> |
drm/i915/reset: conversion to new drm logging macros in gt/intel_reset.c This converts most instances of the printk based drm logging macros in i915/gt/intel_resect.c to the new struct drm_based logging macros. Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200128071437.9284-4-wambui.karugax@gmail.com
|
#
a2847782 |
|
27-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Lift set-wedged engine dumping out of user paths The user (e.g. gem_eio) can manipulate the driver into wedging itself, allowing the user to trigger voluminous logging of inconsequential details. If we lift the dump to direct calls to intel_gt_set_wedged(), out of the intel_reset failure handling, we keep the detail logging for what we expect are true HW or test failures without being tricked. Reported-by: Tomi Sarvela <tomi.p.sarvela@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tomi Sarvela <tomi.p.sarvela@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200127231540.3302516-6-chris@chris-wilson.co.uk
|
#
742379c0 |
|
09-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Start chopping up the GPU error capture In the near future, we will want to start a GPU error capture from a new context, from inside the softirq region of a forced preemption. To do so requires us to break up the monolithic error capture to provide new entry points with finer control; in particular focusing on one engine/gt, and being able to compose an error state from little pieces of HW capture. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200110123059.1348712-1-chris@chris-wilson.co.uk
|
#
3fbbbef4 |
|
06-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Convert the final GEM_TRACE to GT_TRACE and co Convert the few remaining GEM_TRACE() used for debugging over to the appropriate GT_TRACE or RQ_TRACE. References: 639f2f24895f ("drm/i915: Introduce new macros for tracing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200106114234.2529613-4-chris@chris-wilson.co.uk
|
#
45b152f7 |
|
29-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Avoid using the GPU before initialisation Mark the GT as wedged so that we are not tempted to use it prior to initialisation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191229183153.3719869-3-chris@chris-wilson.co.uk
|
#
9eae5e27 |
|
24-Dec-2019 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: prefer 3-letter acronym for ironlake We are currently using a mix of platform name and acronym to name the functions. Let's prefer the acronym as it should be clear what platform it's about and it's shorter, so it doesn't go over 80 columns in a few cases. This converts ironlake to ilk where appropriate. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Jani Nikula <jani.nikula@linux.intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191224084012.24241-7-lucas.demarchi@intel.com
|
#
b761a7b4 |
|
26-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Ignore incomplete engines after init failure Do not expose incomplete engines to the user after we fail to setup the GT. Fixes: e6ba76480299 ("drm/i915: Remove i915->kernel_context") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191226191237.2654510-1-chris@chris-wilson.co.uk
|
#
6a8679c0 |
|
22-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Mark the GEM context link as RCU protected The only protection for intel_context.gem_cotext is granted by RCU, so annotate it as a rcu protected pointer and carefully dereference it in the few occasions we need to use it. Fixes: 9f3ccd40acf4 ("drm/i915: Drop GEM context as a direct link from i915_request") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222233558.2201901-1-chris@chris-wilson.co.uk
|
#
e26b6d43 |
|
21-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Pull GT initialisation under intel_gt_init() Begin pulling the GT setup underneath a single GT umbrella; let intel_gt take ownership of its engines! As hinted, the complication is the lifetime of the probed engine versus the active lifetime of the GT backends. We need to detect the engine layout early and keep it until the end so that we can sanitize state on takeover and release. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222120752.1368352-1-chris@chris-wilson.co.uk
|
#
e6ba7648 |
|
21-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove i915->kernel_context Allocate only an internal intel_context for the kernel_context, forgoing a global GEM context for internal use as we only require a separate address space (for our own protection). Now having weaned GT from requiring ce->gem_context, we can stop referencing it entirely. This also means we no longer have to create random and unnecessary GEM contexts for internal use. GEM contexts are now entirely for tracking GEM clients, and intel_context the execution environment on the GPU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk
|
#
9f3ccd40 |
|
20-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Drop GEM context as a direct link from i915_request Keep the intel_context as being the primary state for i915_request, with the GEM context a backpointer from the low level state for the rarer cases we need client information. Our goal is to remove such references to clients from the backend, and leave the HW submission agnostic to client interfaces and self-contained. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk
|
#
54400257 |
|
17-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Remove direct invocation of breadcrumb signaling Only signal the breadcrumbs from inside the irq_work, simplifying our interface and calling conventions. The micro-optimisation here is that by always using the irq_work interface, we know we are always inside an irq-off critical section for the breadcrumb signaling and can ellide save/restore of the irq flags. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191217095642.3124521-7-chris@chris-wilson.co.uk
|
#
639f2f24 |
|
13-Dec-2019 |
Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> |
drm/i915: Introduce new macros for tracing New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and GT_TRACE() are introduce to tag device name and engine name with contexts and requests tracing in i915. Cc: Sudeep Dutt <sudeep.dutt@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com
|
#
3c9abe88 |
|
05-Dec-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/guc: kill the GuC client We now only use 1 client without any plan to add more. The client is also only holding information about the WQ and the process desc, so we can just move those in the intel_guc structure and always use stage_id 0. v2: fix comment (John) v3: fix the comment for real, fix kerneldoc Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191205220243.27403-4-daniele.ceraolospurio@intel.com
|
#
cc662126 |
|
03-Dec-2019 |
Abdiel Janulgue <abdiel.janulgue@linux.intel.com> |
drm/i915: Introduce DRM_I915_GEM_MMAP_OFFSET This is really just an alias of mmap_gtt. The 'mmap offset' nomenclature comes from the value returned by this ioctl which is the offset into the device fd which userpace uses with mmap(2). mmap_gtt was our initial mmap_offset implementation, this extends our CPU mmap support to allow additional fault handlers that depends on the object's backing pages. Note that we multiplex mmap_gtt and mmap_offset through the same ioctl, and use the zero extending behaviour of drm to differentiate between them, when we inspect the flags. To support multiple mmap types on an object we need to support multiple mmap_offsets for an object (each offset in the global device address space corresponding to a unique instance of the object for a file + mmap type). As we drop the simplified drm core idea of a single mmap_offset, we need to provide replacement hooks for the dumb mmap interface as well. Link: https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1675 Testcase: igt/gem_mmap_offset Signed-off-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191204120032.3682839-1-chris@chris-wilson.co.uk
|
#
ee33baa8 |
|
19-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Mark up the calling context for intel_wakeref_put() Previously, we assumed we could use mutex_trylock() within an atomic context, falling back to a worker if contended. However, such trickery is illegal inside interrupt context, and so we need to always use a worker under such circumstances. As we normally are in process context, we can typically use a plain mutex, and only defer to a work when we know we are being called from an interrupt path. Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref") References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") References: https://bugs.freedesktop.org/show_bug.cgi?id=111626 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk (cherry picked from commit 07779a76ee1f93f930cf697b22be73d16e14f50c) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
88cec497 |
|
20-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Declare timeline.lock to be irq-free Now that we never allow the intel_wakeref callbacks to be invoked from interrupt context, we do not need the irqsafe spinlock for the timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120170858.3965380-1-chris@chris-wilson.co.uk
|
#
07779a76 |
|
19-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Mark up the calling context for intel_wakeref_put() Previously, we assumed we could use mutex_trylock() within an atomic context, falling back to a worker if contended. However, such trickery is illegal inside interrupt context, and so we need to always use a worker under such circumstances. As we normally are in process context, we can typically use a plain mutex, and only defer to a work when we know we are being called from an interrupt path. Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref") References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts") References: https://bugs.freedesktop.org/show_bug.cgi?id=111626 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
|
#
e8887bb3 |
|
11-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Cancel context if it hangs after it is closed If we detect a hang in a closed context, just flush all of its requests and cancel any remaining execution along the context. Note that after closing the context, the last reference to the context may be dropped, leaving it only valid under RCU. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-5-chris@chris-wilson.co.uk
|
#
dfd9c1b4 |
|
11-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Show guilty context name on GPU reset We mention that we are resetting the GPU, and dump the device state for post mortem debugging. However, while that dump contains the active processes and the one flagged as causing the error, we do not always include that information in dmesg. Include the name of the guilty process in dmesg for reference. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-4-chris@chris-wilson.co.uk
|
#
058179e7 |
|
23-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Replace hangcheck by heartbeats Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk
|
#
5d904e3c |
|
17-Oct-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Pass in intel_gt at some for_each_engine sites Where the function, or code segment, operates on intel_gt, we need to start passing it instead of i915 to for_each_engine(_masked). This is another partial step in migration of i915->engines[] to gt->engines[]. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017094500.21831-2-tvrtko.ursulin@linux.intel.com
|
#
a50134b1 |
|
17-Oct-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Make for_each_engine_masked work on intel_gt Medium term goal is to eliminate the i915->engine[] array and to get there we have recently introduced equivalent array in intel_gt. Now we need to migrate the code further towards this state. This next step is to eliminate usage of i915->engines[] from the for_each_engine_masked iterator. For this to work we also need to use engine->id as index when populating the gt->engine[] array and adjust the default engine set indexing to use engine->legacy_idx instead of assuming gt->engines[] indexing. v2: * Populate gt->engine[] earlier. * Check that we don't duplicate engine->legacy_idx v3: * Work around the initialization order issue between default_engines() and intel_engines_driver_register() which sets engine->legacy_idx for now. It will be fixed properly later. v4: * Merge with forgotten v2.5. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com
|
#
e9d4c924 |
|
16-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Store i915_ggtt as the backpointer on fence registers Now that i915_ggtt knows everything about its own paths to perform mmio, we can use that as our primary backpointer for individual fence registers. This reduces the amount of pointer dancing we have to perform on the common paths, but more importantly finishes our fence register encapsulation. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191016143234.4075-1-chris@chris-wilson.co.uk
|
#
542a5c66 |
|
02-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Warn CI about an unrecoverable wedge If we have a wedged GPU that we need to recover, but fail, add a taint for CI to pickup and schedule a reboot. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Petri Latvala <petri.latvala@intel.com> Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191002160034.5121-1-chris@chris-wilson.co.uk
|
#
68184eb7 |
|
23-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Fixup preempt-to-busy vs reset of a virtual request Due to the nature of preempt-to-busy the execlists active tracking and the schedule queue may become temporarily desync'ed (between resubmission to HW and its ack from HW). This means that we may have unwound a request and passed it back to the virtual engine, but it is still inflight on the HW and may even result in a GPU hang. If we detect that GPU hang and try to reset, the hanging request->engine will no longer match the current engine, which means that the request is not on the execlists active list and we should not try to find an older incomplete request. Given that we have deduced this must be a request on a virtual engine, it is the single active request in the context and so must be guilty (as the context is still inflight, it is prevented from being executed on another engine as we process the reset). Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-2-chris@chris-wilson.co.uk (cherry picked from commit cb2377a919bbe8107af269c5a31a8d5cfb27d867) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
cd6a8513 |
|
07-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Prefer local path to runtime powermanagement Avoid going to the base i915 device when we already have a path from gt to the runtime powermanagement interface. The benefit is that it looks a bit more self-consistent to always be acquiring the gt->uncore->rpm for use with the gt->uncore. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191007154531.1750-1-chris@chris-wilson.co.uk
|
#
b9dcb97b |
|
07-Oct-2019 |
Colin Ian King <colin.king@canonical.com> |
drm/i915: make array hw_engine_mask static, makes object smaller Don't populate the array hw_engine_mask on the stack but instead make it static. Makes the object code smaller by 316 bytes. Before: text data bss dec hex filename 34004 4388 320 38712 9738 gpu/drm/i915/gt/intel_reset.o After: text data bss dec hex filename 33528 4548 320 38396 95fc gpu/drm/i915/gt/intel_reset.o (gcc version 9.2.1, amd64) Signed-off-by: Colin Ian King <colin.king@canonical.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191007154151.23245-1-colin.king@canonical.com
|
#
fda9fa19 |
|
12-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Don't mix srcu tag and negative error codes While srcu may use an integer tag, it does not exclude potential error codes and so may overlap with our own use of -EINTR. Use a separate outparam to store the tag, and report the error code separately. Fixes: 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912160834.30601-1-chris@chris-wilson.co.uk (cherry picked from commit eebab60f224fcfd560957715d08c31564d8672ed) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
cb5eb072 |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/overlay: Drop struct_mutex guard The overlay uses the modeset mutex to control itself and only required the struct_mutex for requests, which is now obsolete. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-16-chris@chris-wilson.co.uk
|
#
b1e3177b |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Coordinate i915_active with its own mutex Forgo the struct_mutex serialisation for i915_active, and interpose its own mutex handling for active/retire. This is a multi-layered sleight-of-hand. First, we had to ensure that no active/retire callbacks accidentally inverted the mutex ordering rules, nor assumed that they were themselves serialised by struct_mutex. More challenging though, is the rule over updating elements of the active rbtree. Instead of the whole i915_active now being serialised by struct_mutex, allocations/rotations of the tree are serialised by the i915_active.mutex and individual nodes are serialised by the caller using the i915_timeline.mutex (we need to use nested spinlocks to interact with the dma_fence callback lists). The pain point here is that instead of a single mutex around execbuf, we now have to take a mutex for active tracker (one for each vma, context, etc) and a couple of spinlocks for each fence update. The improvement in fine grained locking allowing for multiple concurrent clients (eventually!) should be worth it in typical loads. v2: Add some comments that barely elucidate anything :( Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-6-chris@chris-wilson.co.uk
|
#
1d6f1d16 |
|
27-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Only unwedge if we can reset first Unwedging the GPU requires a successful GPU reset before we restore the default submission, or else we may see residual context switch events that we were not expecting. v2: Pull in the special-case reset_clobbers_display, and explain why it should be safe in the context of unwedging. v3: Just forget all about resets before unwedging if it will clobber the display; risk it all. Reported-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v1 Link: https://patchwork.freedesktop.org/patch/msgid/20190927160335.10622-1-chris@chris-wilson.co.uk
|
#
4abc6e7c |
|
27-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Provide a mock GPU reset routine For those mock tests that may wish to pretend triggering a GPU reset and processing the cleanup. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190927211749.2181-3-chris@chris-wilson.co.uk
|
#
260e6b71 |
|
27-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Pass intel_gt to has-reset? As we execute GPU resets on a gt/ basis, and use the intel_gt as the primary for all other reset functions, also use it for the has-reset? predicates. Gradually simplifying the churn of pointers. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190927211749.2181-1-chris@chris-wilson.co.uk
|
#
e3792238 |
|
25-Sep-2019 |
Sebastian Andrzej Siewior <bigeasy@linutronix.de> |
drm/i915: Don't disable interrupts for intel_engine_breadcrumbs_irq() The function intel_engine_breadcrumbs_irq() is always invoked from an interrupt handler and for that reason it invokes (as an optimisation) only spin_lock() for locking assuming that the interrupts are already disabled. The function intel_engine_signal_breadcrumbs() is provided to disable interrupts while the former function is invoked so that assumption is also true for callers from preemptible context. On PREEMPT_RT local_irq_disable() really disables interrupts and this forbids to invoke spin_lock() which becomes a sleeping spinlock. This is also problematic with `threadirqs' in conjunction with irq_work. With force threading the interrupt handler, the handler is invoked with disabled BH but with interrupts enabled. This is okay and the lock itself is never acquired in IRQ context. This changes with irq_work (signal_irq_work()) which _still_ invokes intel_engine_breadcrumbs_irq() from IRQ context. Lockdep should see this and complain. Acquire the locks in intel_engine_breadcrumbs_irq() with _irqsave() suffix and let all callers invoke intel_engine_breadcrumbs_irq() directly instead using intel_engine_signal_breadcrumbs(). Reported-by: Clark Williams <williams@redhat.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190926105644.16703-2-bigeasy@linutronix.de
|
#
5311f517 |
|
26-Sep-2019 |
Michał Winiarski <michal.winiarski@intel.com> |
drm/i915: Define explicit wedged on init reset state We're currently using scratch presence as a way of identifying that we entered wedged state at driver initialization time. Let's use a separate flag rather than rely on scratch. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190926133142.2838-1-chris@chris-wilson.co.uk
|
#
cb2377a9 |
|
23-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Fixup preempt-to-busy vs reset of a virtual request Due to the nature of preempt-to-busy the execlists active tracking and the schedule queue may become temporarily desync'ed (between resubmission to HW and its ack from HW). This means that we may have unwound a request and passed it back to the virtual engine, but it is still inflight on the HW and may even result in a GPU hang. If we detect that GPU hang and try to reset, the hanging request->engine will no longer match the current engine, which means that the request is not on the execlists active list and we should not try to find an older incomplete request. Given that we have deduced this must be a request on a virtual engine, it is the single active request in the context and so must be guilty (as the context is still inflight, it is prevented from being executed on another engine as we process the reset). Fixes: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190923152844.8914-2-chris@chris-wilson.co.uk
|
#
0d333ac7 |
|
18-Sep-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: fix SFC reset flow Our assumption that the we can ask the HW to lock the SFC even if not currently in use does not match the HW commitment. The expectation from the HW is that SW will not try to lock the SFC if the engine is not using it and if we do that the behavior is undefined; on ICL the HW ends up to returning the ack and ignoring our lock request, but this is not guaranteed and we shouldn't expect it going forward. Also, failing to get the ack while the SFC is in use means that we can't cleanly reset it, so fail the engine reset in that scenario. v2: drop rmw change, keep the log as debug and handle failure (Chris), improve comments (Tvrtko). Reported-by: Owen Zhang <owen.zhang@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190919015330.15435-1-daniele.ceraolospurio@intel.com
|
#
eebab60f |
|
12-Sep-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Don't mix srcu tag and negative error codes While srcu may use an integer tag, it does not exclude potential error codes and so may overlap with our own use of -EINTR. Use a separate outparam to store the tag, and report the error code separately. Fixes: 2caffbf11762 ("drm/i915: Revoke mmaps and prevent access to fence registers across reset") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190912160834.30601-1-chris@chris-wilson.co.uk
|
#
61fa60ff |
|
10-Sep-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Move GT init to intel_gt.c Code in i915_gem_init_hw is all about GT init so move it to intel_gt.c renaming to intel_gt_init_hw. Existing intel_gt_init_hw is renamed to intel_gt_init_hw_early since it is currently called from driver probe. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190910143823.10686-2-tvrtko.ursulin@linux.intel.com
|
#
ff36c5c4 |
|
23-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Hold irq-off for the entire fake lock period Sadly lockdep records when the irqs are re-enabled and then marks up the fake lock as being irq-unsafe. Our hand is forced and so we must mark up the entire fake lock critical section as irq-off. Hopefully this is the last tweak required! v2: Not quite, we need to mark the timeline spinlock as irqsafe. That was a genuine bug being hidden by the earlier lockdep splat. Fixes: d67739268cf0 ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk (cherry picked from commit 6dcb85a0ad990455ae7c596e3fc966ad9c1ba9c5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
6dcb85a0 |
|
23-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Hold irq-off for the entire fake lock period Sadly lockdep records when the irqs are re-enabled and then marks up the fake lock as being irq-unsafe. Our hand is forced and so we must mark up the entire fake lock critical section as irq-off. Hopefully this is the last tweak required! v2: Not quite, we need to mark the timeline spinlock as irqsafe. That was a genuine bug being hidden by the earlier lockdep splat. Fixes: d67739268cf0 ("drm/i915/gt: Mark up the nested engine-pm timeline lock as irqsafe") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190823132700.25286-2-chris@chris-wilson.co.uk
|
#
338aade9 |
|
15-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Convert timeline tracking to spinlock Convert the active_list manipulation of timelines to use spinlocks so that we can perform the updates from underneath a quick interrupt callback, if need be. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-2-chris@chris-wilson.co.uk
|
#
1d455f8d |
|
06-Aug-2019 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: rename intel_drv.h to display/intel_display_types.h Everything about the file is about display, and mostly about types related to display. Move under display/ as intel_display_types.h to reflect the facts. There's still plenty to clean up, but start off with moving the file where it logically belongs and naming according to contents. v2: fix the include guard name in the renamed file Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190806113933.11799-1-jani.nikula@intel.com
|
#
c29579d2 |
|
06-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Make caps.scheduler static We do not notify userspace when the scheduler capabilities are changed (due to wedging the driver) and as such userspace will expect the caps to be static and unchanging. Make it so, and so we only need to compute our caps once during driver registration. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-1-chris@chris-wilson.co.uk
|
#
4b9bb972 |
|
26-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Only recover active engines If we issue a reset to a currently idle engine, leave it idle afterwards. This is useful to excise a linkage between reset and the shrinker. When waking the engine, we need to pin the default context image which we use for overwriting a guilty context -- if the engine is idle we do not need this pinned image! However, this pinning means that waking the engine acquires the FS_RECLAIM, and so may trigger the shrinker. The shrinker itself may need to wait upon the GPU to unbind and object and so may require services of reset; ergo we should avoid the engine wake up path. The danger in skipping the recovery for idle engines is that we leave the engine with no context defined, which may interfere with the operation of the power context on some older platforms. In practice, we should only be resetting an active GPU but it something to look out for on Ironlake (if memory serves). Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-2-chris@chris-wilson.co.uk (cherry picked from commit 18398904ca9e3ddd180e2ecd45886e146b1d9d5b) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
0de50e40 |
|
26-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Lift intel_engines_resume() to callers Since the reset path wants to recover the engines itself, it only wants to reinitialise the hardware using i915_gem_init_hw(). Pull the call to intel_engines_resume() to the module init/resume path so we can avoid it during reset. Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-3-chris@chris-wilson.co.uk (cherry picked from commit 092be382a2602067766f190a113514d469162456) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
702668e6 |
|
24-Jul-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/uc: Unify uC platform check We have several HAS_* checks for GuC and HuC but we mostly use HAS_GUC and HAS_HUC, with only 1 exception. Since our HW always has either both uC or neither of them, just replace all the checks with a unified HAS_UC. v2: use HAS_GT_UC (Michal) v3: fix comment (Michal) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Acked-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190725001813.4740-2-daniele.ceraolospurio@intel.com
|
#
c30d5dc6 |
|
16-Jul-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Push engine stopping into reset-prepare Push the engine stop into the back reset_prepare (where it already was!) This allows us to avoid dangerously setting the RING registers to 0 for logical contexts. If we clear the register on a live context, those invalid register values are recorded in the logical context state and replayed (with hilarious results). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190716124931.5870-2-chris@chris-wilson.co.uk
|
#
ca7b2c1b |
|
13-Jul-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/uc: Move intel functions to intel_uc All the intel_uc_* can now be moved to work on the intel_uc structure for better encapsulation of uc-related actions. Note: I've introduced uc_to_gt instead of uc_to_i915 because the aim is to move everything to be gt-focused in the medium term, so we would've had to replace it soon anyway. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-8-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
#
8b5689d7 |
|
13-Jul-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/uc: move GuC/HuC inside intel_gt under a new intel_uc Being part of the GT HW, it make sense to keep the guc/huc structures inside the GT structure. To help with the encapsulation work done by the following patches, both structures are placed inside a new intel_uc container. Although this results in code with ugly nested dereferences (i915->gt.uc.guc...), it saves us the extra work required in moving the structures twice (i915 -> gt -> uc). The following patches will reduce the number of places where we try to access the guc/huc structures directly from i915 and reduce the ugliness. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-7-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
#
0f261b24 |
|
13-Jul-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/uc: move GuC and HuC files under gt/uc/ Both microcontrollers are part of the GT HW and are closely related to GT operations. To keep all the files cleanly together, they've been placed in their own subdir inside the gt/ folder Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190713100016.8026-6-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
#
cb823ed9 |
|
12-Jul-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Use intel_gt as the primary object for handling resets Having taken the first step in encapsulating the functionality by moving the related files under gt/, the next step is to start encapsulating by passing around the relevant structs rather than the global drm_i915_private. In this step, we pass intel_gt to intel_reset.c Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk
|
#
092be382 |
|
26-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Lift intel_engines_resume() to callers Since the reset path wants to recover the engines itself, it only wants to reinitialise the hardware using i915_gem_init_hw(). Pull the call to intel_engines_resume() to the module init/resume path so we can avoid it during reset. Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-3-chris@chris-wilson.co.uk
|
#
18398904 |
|
26-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Only recover active engines If we issue a reset to a currently idle engine, leave it idle afterwards. This is useful to excise a linkage between reset and the shrinker. When waking the engine, we need to pin the default context image which we use for overwriting a guilty context -- if the engine is idle we do not need this pinned image! However, this pinning means that waking the engine acquires the FS_RECLAIM, and so may trigger the shrinker. The shrinker itself may need to wait upon the GPU to unbind and object and so may require services of reset; ergo we should avoid the engine wake up path. The danger in skipping the recovery for idle engines is that we leave the engine with no context defined, which may interfere with the operation of the power context on some older platforms. In practice, we should only be resetting an active GPU but it something to look out for on Ironlake (if memory serves). Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190626154549.10066-2-chris@chris-wilson.co.uk
|
#
5f22e5b3 |
|
25-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Rename intel_wakeref_[is]_active Our general rule is to use is/has as the verb for boolean functions, rename intel_wakeref_active to intel_wakeref_is_active so the question being asked is clear. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-6-chris@chris-wilson.co.uk
|
#
0c91621c |
|
25-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Pass intel_gt to pm routines Switch from passing the i915 container to newly named struct intel_gt. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-2-chris@chris-wilson.co.uk
|
#
f0c02c1b |
|
21-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Rename i915_timeline to intel_timeline and move under gt Move all timeline code under gt and rename to intel_gt prefix. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
|
#
eaf522f6 |
|
21-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Make i915_check_and_clear_faults take intel_gt Continuing the conversion and elimination of implicit dev_priv. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-6-tvrtko.ursulin@linux.intel.com
|
#
df0566a6 |
|
13-Jun-2019 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: move modesetting core code under display/ Now that we have a new subdirectory for display code, continue by moving modesetting core code. display/intel_frontbuffer.h sticks out like a sore thumb, otherwise this is, again, a surprisingly clean operation. v2: - don't move intel_sideband.[ch] (Ville) - use tabs for Makefile file lists and sort them Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Acked-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190613084416.6794-3-jani.nikula@intel.com
|
#
422d7df4 |
|
14-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Replace engine->timeline with a plain list To continue the onslaught of removing the assumption of a global execution ordering, another casualty is the engine->timeline. Without an actual timeline to track, it is overkill and we can replace it with a much less grand plain list. We still need a list of requests inflight, for the simple purpose of finding inflight requests (for retiring, resetting, preemption etc). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
|
#
c447ff7d |
|
13-Jun-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: update with_intel_runtime_pm to use the rpm structure Matching the underlying get/put functions. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-8-daniele.ceraolospurio@intel.com
|
#
d858d569 |
|
13-Jun-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: update rpm_get/put to use the rpm structure The functions where internally already only using the structure, so we need to just flip the interface. v2: rebase Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Imre Deak <imre.deak@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com
|
#
84383d2e |
|
14-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Refine i915_reset.lock_map We already use a mutex to serialise i915_reset() and wedging, so all we need it to link that into i915_request_wait() and we have our lock cycle detection. v2.5: Take error mutex for selftests Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614071023.17929-3-chris@chris-wilson.co.uk
|
#
0cf289bd |
|
13-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move fence register tracking from i915->mm to ggtt As the fence registers only apply to regions inside the GGTT is makes more sense that we track these as part of the i915_ggtt and not the general mm. In the next patch, we will then pull the register locking underneath the i915_ggtt.mutex. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190613073254.24048-1-chris@chris-wilson.co.uk
|
#
33df8a76 |
|
12-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Prevent lock-cycles between GPU waits and GPU resets We cannot allow ourselves to wait on the GPU while holding any lock as we may need to reset the GPU. While there is not an explicit lock between the two operations, lockdep cannot detect the dependency. So let's tell lockdep about the wait/reset dependency with an explicit lockmap. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190612085246.16374-1-chris@chris-wilson.co.uk
|
#
6a8cc66f |
|
06-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Move i915_check_and_clear_faults to intel_reset.c The code is logically about reset so it makes sense. It also enables making i915_clear_error_registers static. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190607115932.20271-1-tvrtko.ursulin@linux.intel.com
|
#
f736ae1b |
|
07-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Extract engine fault reset to a helper Just tidying the flow a bit. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-4-tvrtko.ursulin@linux.intel.com
|
#
77a302e0 |
|
07-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Make Gen6/7 RING_FAULT_REG access engine centric Similar to earlier conversions, eliminate the implicit dev_priv by introducing some helpers which take the engine parameter (since the register itself is per engine). v2: * Always use parentheses in macro arguments. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190607101535.767-1-tvrtko.ursulin@linux.intel.com
|
#
b61ea001 |
|
07-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Reset only affected engines when handling error capture Pass down the engine mask to i915_clear_error_registers so only affected engines can be reset on the Gen6/7 path. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190607082557.31670-1-tvrtko.ursulin@linux.intel.com
|
#
10be98a7 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move more GEM objects under gem/ Continuing the theme of separating out the GEM clutter. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk
|
#
932309fb |
|
22-May-2019 |
Michal Wajdeczko <michal.wajdeczko@intel.com> |
drm/i915/selftests: Move some reset testcases to separate file igt_global_reset and igt_wedged_reset testcases are first candidates. Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522193203.23932-2-michal.wajdeczko@intel.com
|
#
18ecc6c5 |
|
07-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Reboot CI if forcewake fails If the HW fails to ack a change in forcewake status, the machine is as good as dead -- it may recover, but in reality it missed the mmio updates and is now in a very inconsistent state. If it happens, we can't trust the CI results (or at least the fails may be genuine but due to the HW being dead and not the actual test!) so reboot the machine (CI checks for a kernel taint in between each test and reboots if the machine is tainted). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190508115245.27790-1-chris@chris-wilson.co.uk
|
#
05ca9306 |
|
29-Apr-2019 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: extract intel_overlay.h from intel_drv.h and i915_drv.h It used to be handy that we only had a couple of headers, but over time intel_drv.h has become unwieldy. Extract declarations to a separate header file corresponding to the implementation module, clarifying the modularity of the driver. Ensure the new header is self-contained, and do so with minimal further includes, using forward declarations as needed. Include the new header only where needed, and sort the modified include directives while at it and as needed. No functional changes. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/2e4fb1e67ed38870df3040bb0a1b1a58fd90cc86.1556540890.git.jani.nikula@intel.com
|
#
440e2b3d |
|
29-Apr-2019 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: extract i915_irq.h from intel_drv.h and i915_drv.h It used to be handy that we only had a couple of headers, but over time intel_drv.h has become unwieldy. Extract declarations to a separate header file corresponding to the implementation module, clarifying the modularity of the driver. Ensure the new header is self-contained, and do so with minimal further includes, using forward declarations as needed. Include the new header only where needed, and sort the modified include directives while at it and as needed. No functional changes. Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/64e46278dc8dccc9c548ef453cb2ceece5367bb2.1556540890.git.jani.nikula@intel.com
|
#
79ffac85 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Invert the GEM wakeref hierarchy In the current scheme, on submitting a request we take a single global GEM wakeref, which trickles down to wake up all GT power domains. This is undesirable as we would like to be able to localise our power management to the available power domains and to remove the global GEM operations from the heart of the driver. (The intent there is to push global GEM decisions to the boundary as used by the GEM user interface.) Now during request construction, each request is responsible via its logical context to acquire a wakeref on each power domain it intends to utilize. Currently, each request takes a wakeref on the engine(s) and the engines themselves take a chipset wakeref. This gives us a transition on each engine which we can extend if we want to insert more powermangement control (such as soft rc6). The global GEM operations that currently require a struct_mutex are reduced to listening to pm events from the chipset GT wakeref. As we reduce the struct_mutex requirement, these listeners should evaporate. Perhaps the biggest immediate change is that this removes the struct_mutex requirement around GT power management, allowing us greater flexibility in request construction. Another important knock-on effect, is that by tracking engine usage, we can insert a switch back to the kernel context on that engine immediately, avoiding any extra delay or inserting global synchronisation barriers. This makes tracking when an engine and its associated contexts are idle much easier -- important for when we forgo our assumed execution ordering and need idle barriers to unpin used contexts. In the process, it means we remove a large chunk of code whose only purpose was to switch back to the kernel context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
|
#
2ccdf6a1 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Pass intel_context to i915_request_create() Start acquiring the logical intel_context and using that as our primary means for request allocation. This is the initial step to allow us to avoid requiring struct_mutex for request allocation along the perma-pinned kernel context, but it also provides a foundation for breaking up the complex request allocation to handle different scenarios inside execbuf. For the purpose of emitting a request from inside retirement (see the next patch for engine power management), we also need to lift control over the timeline mutex to the caller. v2: Note that the request carries the active reference upon construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-4-chris@chris-wilson.co.uk
|
#
112ed2d3 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GraphicsTechnology files under gt/ Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
|