#
5e4e06e4 |
|
30-Oct-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: Track gt pm wakerefs Track every intel_gt_pm_get() until its corresponding release in intel_gt_pm_put() by returning a cookie to the caller for acquire that must be passed by on released. When there is an imbalance, we can see who either tried to free a stale wakeref, or who forgot to free theirs. v2: track recently added calls in gen8_ggtt_bind_get_ce and destroyed_worker_func Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231030-ref_tracker_i915-v1-2-006fe6b96421@intel.com
|
#
4485bd51 |
|
12-Sep-2023 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
i915/pmu: Move execlist stats initialization to execlist specific setup engine->stats is a union of execlist and guc stat objects. When execlist specific fields are initialized, the initial state of guc stats is affected. This results in bad busyness values when using GuC mode. Move the execlist initialization from common code to execlist specific code. Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230912212247.1828681-1-umesh.nerlige.ramappa@intel.com
|
#
28c46fee |
|
21-Aug-2023 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Consolidate condition for Wa_22011802037 The workaround bounds for Wa_22011802037 are somewhat complex and are replicated in several places throughout the code. Pull the condition out to a helper function to prevent mistakes if this condition needs to change again in the future. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-12-matthew.d.roper@intel.com
|
#
c524cd40 |
|
12-Sep-2023 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
i915/pmu: Move execlist stats initialization to execlist specific setup engine->stats is a union of execlist and guc stat objects. When execlist specific fields are initialized, the initial state of guc stats is affected. This results in bad busyness values when using GuC mode. Move the execlist initialization from common code to execlist specific code. Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu") Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230912212247.1828681-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 4485bd519f5d6d620a29d0547ff3c982bdeeb468) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
d3f23ab9 |
|
20-Jul-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: use direct alias for i915 in requests i915_request contains direct alias to i915, there is no point to go via rq->engine->i915. v2: added missing rq.i915 initialization in measure_breadcrumb_dw. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230720113002.1541572-1-andrzej.hajda@intel.com
|
#
72e9abc3 |
|
26-Jun-2023 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/uncore: add intel_uncore_regs() helper Add a helper for accessing uncore->regs instead of doing it directly. This will help display code reuse with the xe driver. Cc: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230627095128.208071-1-jani.nikula@intel.com
|
#
848a4e5c |
|
08-Jun-2023 |
Luca Coelho <luciano.coelho@intel.com> |
drm/i915: add a dedicated workqueue inside drm_i915_private In order to avoid flush_scheduled_work() usage, add a dedicated workqueue in the drm_i915_private structure. In this way, we don't need to use the system queue anymore. This change is mostly mechanical and based on Tetsuo's original patch[1]. v6 by Jani: - Also create unordered_wq for mock device Link: https://patchwork.freedesktop.org/series/114608/ [1] Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/c816ebe17ef08d363981942a096a586a7658a65e.1686231190.git.jani.nikula@intel.com
|
#
b3e70051 |
|
20-Mar-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Fix context runtime accounting When considering whether to mark one context as stopped and another as started we need to look at whether the previous and new _contexts_ are different and not just requests. Otherwise the software tracked context start time was incorrectly updated to the most recent lite-restore time- stamp, which was in some cases resulting in active time going backward, until the context switch (typically the heartbeat pulse) would synchronise with the hardware tracked context runtime. Easiest use case to observe this behaviour was with a full screen clients with close to 100% engine load. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: bb6287cb1886 ("drm/i915: Track context current active time") Cc: <stable@vger.kernel.org> # v5.19+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320151423.1708436-1-tvrtko.ursulin@linux.intel.com [tursulin: Fix spelling in commit msg.]
|
#
dc342156 |
|
20-Mar-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Fix context runtime accounting When considering whether to mark one context as stopped and another as started we need to look at whether the previous and new _contexts_ are different and not just requests. Otherwise the software tracked context start time was incorrectly updated to the most recent lite-restore time- stamp, which was in some cases resulting in active time going backward, until the context switch (typically the heartbeat pulse) would synchronise with the hardware tracked context runtime. Easiest use case to observe this behaviour was with a full screen clients with close to 100% engine load. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: bb6287cb1886 ("drm/i915: Track context current active time") Cc: <stable@vger.kernel.org> # v5.19+ Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230320151423.1708436-1-tvrtko.ursulin@linux.intel.com [tursulin: Fix spelling in commit msg.] (cherry picked from commit b3e70051879c665acdd3a1ab50d0ed58d6a8001f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
25746a3f |
|
06-Feb-2023 |
Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
drm/i915: fix up merge with usb-next branch In the manual fixup of the list_count_nodes() logic in drivers/gpu/drm/i915/gt/intel_execlists_submission.c in the usb-next branch, I missed that the print modifier was incorrect, resulting in loads of build warnings on 32bit systems. Fix this up by using "%su" instead of "%lu". Reported-by: kernel test robot <lkp@intel.com> Fixes: 924fb3ec50f5 ("Merge 6.2-rc7 into usb-next") Link: https://lore.kernel.org/r/20230206124422.2266892-1-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
#
a4be3dca |
|
26-Jan-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915: Fix up locking around dumping requests lists The debugfs dump of requests was confused about what state requires the execlist lock versus the GuC lock. There was also a bunch of duplicated messy code between it and the error capture code. So refactor the hung request search into a re-usable function. And reduce the span of the execlist state lock to only the execlist specific code paths. In order to do that, also move the report of hold count (which is an execlist only concept) from the top level dump function to the lower level execlist specific function. Also, move the execlist specific code into the execlist source file. v2: Rename some functions and move to more appropriate files (Daniele). v3: Rename new execlist dump function (Daniele) Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
|
#
41bb543f |
|
05-Jan-2023 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/mtl: Add initial gt workarounds This patch introduces initial gt workarounds for the MTL platform. v2: drop redundant/stale comments specifying wa platforms affected (Lucas). v3: drop additional redundant stale comments (MattR) Bspec: 66622 Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230105234408.277750-1-matthew.s.atwood@intel.com
|
#
5bc4b43d |
|
26-Jan-2023 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915: Fix up locking around dumping requests lists The debugfs dump of requests was confused about what state requires the execlist lock versus the GuC lock. There was also a bunch of duplicated messy code between it and the error capture code. So refactor the hung request search into a re-usable function. And reduce the span of the execlist state lock to only the execlist specific code paths. In order to do that, also move the report of hold count (which is an execlist only concept) from the top level dump function to the lower level execlist specific function. Also, move the execlist specific code into the execlist source file. v2: Rename some functions and move to more appropriate files (Daniele). v3: Rename new execlist dump function (Daniele) Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC") Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Cc: Michael Cheng <michael.cheng@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: Bruce Chang <yu.bruce.chang@intel.com> Cc: Alan Previn <alan.previn.teres.alexis@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com (cherry picked from commit a4be3dca53172d9d2091e4b474fb795c81ed3d6c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
8032bf12 |
|
09-Oct-2022 |
Jason A. Donenfeld <Jason@zx2c4.com> |
treewide: use get_random_u32_below() instead of deprecated function This is a simple mechanical transformation done by: @@ expression E; @@ - prandom_u32_max + get_random_u32_below (E) Reviewed-by: Kees Cook <keescook@chromium.org> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs Reviewed-by: SeongJae Park <sj@kernel.org> # for damon Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
#
801543b2 |
|
09-Nov-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: stop including i915_irq.h from i915_trace.h Turns out many of the files that need i915_reg.h get it implicitly via {display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h -> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h, makes sense to drop it, but that requires adding quite a few new includes all over the place. Prefer including i915_reg.h where needed instead of adding another implicit include, because eventually we'll want to split up i915_reg.h and only include the specific registers at each place. Also some places actually needed i915_irq.h too. Cc: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com
|
#
a10234fd |
|
09-Nov-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Partial abandonment of legacy DRM logging macros Convert some usages of legacy DRM logging macros into versions which tell us on which device have the events occurred. v2: * Don't have struct drm_device as local. (Jani, Ville) v3: * Store gt, not i915, in workaround list. (John) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221109104633.2579245-1-tvrtko.ursulin@linux.intel.com
|
#
d263545e |
|
29-Sep-2022 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: Fix platform prefix Different handling for XeHP and later platforms should be using the xehp prefix, not gen125. Rename them. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220930050903.3479619-4-lucas.demarchi@intel.com
|
#
107ba1a2 |
|
21-Sep-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Restrict forced preemption to the active context When we submit a new pair of contexts to ELSP for execution, we start a timer by which point we expect the HW to have switched execution to the pending contexts. If the promotion to the new pair of contexts has not occurred, we declare the executing context to have hung and force the preemption to take place by resetting the engine and resubmitting the new contexts. This can lead to an unfair situation where almost all of the preemption timeout is consumed by the first context which just switches into the second context immediately prior to the timer firing and triggering the preemption reset (assuming that the timer interrupts before we process the CS events for the context switch). The second context hasn't yet had a chance to yield to the incoming ELSP (and send the ACk for the promotion) and so ends up being blamed for the reset. If we see that a context switch has occurred since setting the preemption timeout, but have not yet received the ACK for the ELSP promotion, rearm the preemption timer and check again. This is especially significant if the first context was not schedulable and so we used the shortest timer possible, greatly increasing the chance of accidentally blaming the second innocent context. Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Andrzej Hajda <andrzej.hajda@intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com
|
#
6ef7d362 |
|
21-Sep-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Restrict forced preemption to the active context When we submit a new pair of contexts to ELSP for execution, we start a timer by which point we expect the HW to have switched execution to the pending contexts. If the promotion to the new pair of contexts has not occurred, we declare the executing context to have hung and force the preemption to take place by resetting the engine and resubmitting the new contexts. This can lead to an unfair situation where almost all of the preemption timeout is consumed by the first context which just switches into the second context immediately prior to the timer firing and triggering the preemption reset (assuming that the timer interrupts before we process the CS events for the context switch). The second context hasn't yet had a chance to yield to the incoming ELSP (and send the ACk for the promotion) and so ends up being blamed for the reset. If we see that a context switch has occurred since setting the preemption timeout, but have not yet received the ACK for the ELSP promotion, rearm the preemption timer and check again. This is especially significant if the first context was not schedulable and so we used the shortest timer possible, greatly increasing the chance of accidentally blaming the second innocent context. Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Andrzej Hajda <andrzej.hajda@intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com (cherry picked from commit 107ba1a2c705f4358f2602ec2f2fd821bb651f42) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
0667429c |
|
21-Jun-2022 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend For execlists backend, current implementation of Wa_22011802037 is to stop the CS before doing a reset of the engine. This WA was further extended to wait for any pending MI FORCE WAKEUPs before issuing a reset. Add the extended steps in the execlist path of reset. In addition, extend the WA to gen11. v2: (Tvrtko) - Clarify comments, commit message, fix typos - Use IS_GRAPHICS_VER for gen 11/12 checks v3: (Daneile) - Drop changes to intel_ring_submission since WA does not apply to it - Log an error if MSG IDLE is not defined for an engine Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: f6aa0d713c88 ("drm/i915: Add Wa_22011802037 force cs halt") Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com
|
#
45c64ecf |
|
27-May-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Improve user experience and driver robustness under SIGINT or similar We have long standing customer complaints that pressing Ctrl-C (or to the effect of) causes engine resets with otherwise well behaving programs. Not only is logging engine resets during normal operation not desirable since it creates support incidents, but more fundamentally we should avoid going the engine reset path when we can since any engine reset introduces a chance of harming an innocent context. Reason for this undesirable behaviour is that the driver currently does not distinguish between banned contexts and non-persistent contexts which have been closed. To fix this we add the distinction between the two reasons for revoking contexts, which then allows the strict timeout only be applied to banned, while innocent contexts (well behaving) can preempt cleanly and exit without triggering the engine reset path. Note that the added context exiting category applies both to closed non- persistent context, and any exiting context when hangcheck has been disabled by the user. At the same time we rename the backend operation from 'ban' to 'revoke' which more accurately describes the actual semantics. (There is no ban at the backend level since banning is a concept driven by the scheduling frontend. Backends are simply able to revoke a running context so that is the more appropriate name chosen.) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220527072452.2225610-1-tvrtko.ursulin@linux.intel.com
|
#
8ae66490 |
|
21-May-2022 |
Julia Lawall <Julia.Lawall@inria.fr> |
drm/i915: fix typos in comments Spelling mistakes (triple letters) in comments. Detected with the help of Coccinelle. Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220521111145.81697-90-Julia.Lawall@inria.fr
|
#
a5c89f7c |
|
04-May-2022 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Support programming the EU priority in the GuC descriptor In GuC submission mode the EU priority must be updated by the GuC rather than the driver as the GuC owns the programming of the context descriptor. Given that the GuC code uses the GuC priorities, we can't use a generic function using i915 priorities for both execlists and GuC submission. The existing function has therefore been pushed to the execlists back-end while a new one has been added for GuC. v2: correctly use the GuC prio. Cc: John Harrison <john.c.harrison@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220504234636.2119794-1-daniele.ceraolospurio@intel.com
|
#
a7a47a5d |
|
21-Jun-2022 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend For execlists backend, current implementation of Wa_22011802037 is to stop the CS before doing a reset of the engine. This WA was further extended to wait for any pending MI FORCE WAKEUPs before issuing a reset. Add the extended steps in the execlist path of reset. In addition, extend the WA to gen11. v2: (Tvrtko) - Clarify comments, commit message, fix typos - Use IS_GRAPHICS_VER for gen 11/12 checks v3: (Daneile) - Drop changes to intel_ring_submission since WA does not apply to it - Log an error if MSG IDLE is not defined for an engine Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Fixes: f6aa0d713c88 ("drm/i915: Add Wa_22011802037 force cs halt") Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com (cherry picked from commit 0667429ce68e0b08f9f1fec8fd0b1f57228f605e) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
|
#
e7999fa1 |
|
04-May-2022 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Support programming the EU priority in the GuC descriptor In GuC submission mode the EU priority must be updated by the GuC rather than the driver as the GuC owns the programming of the context descriptor. Given that the GuC code uses the GuC priorities, we can't use a generic function using i915 priorities for both execlists and GuC submission. The existing function has therefore been pushed to the execlists back-end while a new one has been added for GuC. v2: correctly use the GuC prio. Cc: John Harrison <john.c.harrison@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220504234636.2119794-1-daniele.ceraolospurio@intel.com (cherry picked from commit a5c89f7c43c12c592a882a0ec2a15e9df0011e80) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
166c44e6 |
|
25-Apr-2022 |
Chris Wilson <chris.p.wilson@intel.com> |
drm/i915/gt: Clear SET_PREDICATE_RESULT prior to executing the ring Userspace may leave predication enabled upon return from the batch buffer, which has the consequent of preventing all operation from the ring from being executed, including all the synchronisation, coherency control, arbitration and user signaling. This is more than just a local gpu hang in one client, as the user has the ability to prevent the kernel from applying critical workarounds and can cause a full GT reset. We could simply execute MI_SET_PREDICATE upon return from the user batch, but this has the repercussion of modifying the user's context state. Instead, we opt to execute a fixup batch which by mixing predicated operations can determine the state of the SET_PREDICATE_RESULT register and restore it prior to the next userspace batch. This allows us to protect the kernel's ring without changing the uABI. Suggested-by: Zbigniew Kempczynski <zbigniew.kempczynski@intel.com> Signed-off-by: Chris Wilson <chris.p.wilson@intel.com> Cc: Zbigniew Kempczynski <zbigniew.kempczynski@intel.com> Cc: Thomas Hellstrom <thomas.hellstrom@intel.com> Signed-off-by: Ramalingam C <ramalingam.c@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220425152317.4275-4-ramalingam.c@intel.com
|
#
bb6287cb |
|
01-Apr-2022 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Track context current active time Track context active (on hardware) status together with the start timestamp. This will be used to provide better granularity of context runtime reporting in conjunction with already tracked pphwsp accumulated runtime. The latter is only updated on context save so does not give us visibility to any currently executing work. As part of the patch the existing runtime tracking data is moved under the new ce->stats member and updated under the seqlock. This provides the ability to atomically read out accumulated plus active runtime. v2: * Rename and make __intel_context_get_active_time unlocked. v3: * Use GRAPHICS_VER. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> # v1 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220401142205.3123159-6-tvrtko.ursulin@linux.intel.com
|
#
a6f0f9cf |
|
21-Mar-2022 |
Alan Previn <alan.previn.teres.alexis@intel.com> |
drm/i915/guc: Plumb GuC-capture into gpu_coredump Add a flags parameter through all of the coredump creation functions. Add a bitmask flag to indicate if the top level gpu_coredump event is triggered in response to a GuC context reset notification. Using that flag, ensure all coredump functions that read or print mmio-register values related to work submission or command-streamer engines are skipped and replaced with a calls guc-capture module equivalent functions to retrieve or print the register dump. While here, split out display related register reading and printing into its own function that is called agnostic to whether GuC had triggered the reset. For now, introduce an empty printing function that can filled in on a subsequent patch just to handle formatting. Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-13-alan.previn.teres.alexis@intel.com
|
#
61c5ed94 |
|
21-Mar-2022 |
Michael Cheng <michael.cheng@intel.com> |
drm/i915/gt: replace cache_clflush_range Replace all occurrence of cache_clflush_range with drm_clflush_virt_range. This will prevent compile errors on non-x86 platforms. Signed-off-by: Michael Cheng <michael.cheng@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-6-michael.cheng@intel.com
|
#
92b0cba4 |
|
21-Mar-2022 |
Michael Cheng <michael.cheng@intel.com> |
drm/i915/gt: Re-work reset_csb Use drm_clflush_virt_range instead of directly invoking clflush. This will prevent compiler errors when building for non-x86 architectures. v2(Michael Cheng): Remove extra clflush v3(Michael Cheng): Remove memory barrier since drm_clflush_virt_range takes care of it. v4(Michael Cheng): Get the size of value and not the size of the pointer when passing in execlists->csb_write. Thanks to Matt Roper for pointing this out. Signed-off-by: Michael Cheng <michael.cheng@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-4-michael.cheng@intel.com
|
#
dc040682 |
|
21-Mar-2022 |
Michael Cheng <michael.cheng@intel.com> |
drm/i915/gt: Drop invalidate_csb_entries Drop invalidate_csb_entries and directly call drm_clflush_virt_range. This allows for one less function call, and prevent complier errors when building for non-x86 architectures. v2(Michael Cheng): Drop invalidate_csb_entries function and directly invoke drm_clflush_virt_range. Thanks to Tvrtko for the sugguestion. v3(Michael Cheng): Use correct parameters for drm_clflush_virt_range. Thanks to Tvrtko for pointing this out. v4(Michael Cheng): Simplify &execlists->csb_status[0] to execlists->csb_status. Thanks to Matt Roper for the suggestion. Signed-off-by: Michael Cheng <michael.cheng@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220321223819.72833-3-michael.cheng@intel.com
|
#
f9576e36 |
|
03-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Support platforms with CCS engines but no RCS In the past we've always assumed that an RCS engine is present on every platform. However now that we have compute engines there may be platforms that have CCS engines but no RCS, or platforms that are designed to have both, but have the RCS engine fused off. Various engine-centric initialization that only needs to be done a single time for the group of RCS+CCS engines can't rely on being setup with the RCS now; instead we add a I915_ENGINE_FIRST_RENDER_COMPUTE flag that will be assigned to a single engine in the group; whichever engine has this flag will be responsible for some of the general setup (RCU_MODE programming, initialization of certain workarounds, etc.). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220303223435.2793124-1-matthew.d.roper@intel.com
|
#
01fabda8 |
|
25-Feb-2022 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: Use str_yes_no() Remove the local yesno() implementation and adopt the str_yes_no() from linux/string_helpers.h. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220225234631.3725943-1-lucas.demarchi@intel.com
|
#
87cb6d80 |
|
01-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Enable ccs/dual-ctx in RCU_MODE We have to specify in the Render Control Unit Mode register when CCS is enabled. v2: - Move RCU_MODE programming to a helper function. (Tvrtko) - Clean up and clarify comments. (Tvrtko) - Add RCU_MODE to the GuC save/restore list. (Daniele) v3: - Move this patch before the GuC ADS update to enable compute engines; the definition of RCU_MODE and its insertion into the save/restore list moves to this patch. (Daniele) v4: - Call xehp_enable_ccs_engines() directly in guc_resume() and execlists_resume() rather than adding an extra layer of wrapping to the engine->resume() vfunc. (Umesh) Bspec: 46034 Original-author: Michel Thierry Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220302001554.1836066-1-matthew.d.roper@intel.com
|
#
adfadb56 |
|
01-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Define context scheduling attributes in lrc descriptor In Dual Context mode the EUs are shared between render and compute command streamers. The hardware provides a field in the lrc descriptor to indicate the prioritization of the thread dispatch associated to the corresponding context. The context priority is set to 'low' at creation time and relies on the existing context priority to set it to low/normal/high. Bspec: 46145, 46260 Original-author: Michel Thierry Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Prasad Nallani <prasad.nallani@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-8-matthew.d.roper@intel.com
|
#
c674c5b9 |
|
01-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: CCS should use RCS setup functions The compute engine handles the same commands the render engine can (except 3D pipeline), so it makes sense that CCS is more similar to RCS than non-render engines. The CCS context state (lrc) is also similar to the render one, so reuse it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE register. In order to avoid having multiple RCS && CCS checks, add the following engine flag: - I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx. BSpec: 46260 Original-author: Michel Thierry Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com
|
#
df62ae6f |
|
09-Feb-2022 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915: move intel_hws_csb_write_index() out of i915_drv.h Underscore prefix the index macros, and place INTEL_HWS_CSB_WRITE_INDEX() as a macro next to them, to declutter i915_drv.h. v2: Don't underscore the index macros (Tvrtko) Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220209131143.3365230-1-jani.nikula@intel.com
|
#
0d6419e9 |
|
27-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Move GT registers to their own header file This is a huge, chaotic mass of registers copied over as-is without any real cleanup. We'll come back and organize these better, align on consistent coding style, remove dead code, etc. in separate patches later that will be easier to review. v2: - Add missing include in intel_pxp_irq.c v3: - Correct a few indentation errors (Lucas) - Minor conflict resolution Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
|
#
202b1f4c |
|
10-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Move engine registers to their own header Let's continue breaking up and cleaning up the massive i915_reg.h file by moving all registers that are defined in relation to an engine base to their own header. There are probably a bunch of other "engine registers" that we haven't moved yet (especially those that belong to the render engine in the 0x2??? range), but this is a relatively straightforward first step. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
|
#
a88afcfa |
|
22-Dec-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/execlists: Weak parallel submission support for execlists A weak implementation of parallel submission (multi-bb execbuf IOCTL) for execlists. Doing as little as possible to support this interface for execlists - basically just passing submit fences between each request generated and virtual engines are not allowed. This is on par with what is there for the existing (hopefully soon deprecated) bonding interface. We perma-pin these execlists contexts to align with GuC implementation. v2: (John Harrison) - Drop siblings array as num_siblings must be 1 v3: (John Harrison) - Drop single submission v4: (John Harrison) - Actually drop single submission - Use IS_ERR check on return value from intel_context_create - Set last request to NULL on unpin Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211222223532.28698-1-matthew.brost@intel.com
|
#
8b91cdd4 |
|
08-Nov-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915: Use __GFP_KSWAPD_RECLAIM in the capture code The capture code is typically run entirely in the fence signalling critical path. We're about to add lockdep annotation in an upcoming patch which reveals a lockdep splat similar to the below one. Fix the associated potential deadlocks using __GFP_KSWAPD_RECLAIM (which is the same as GFP_WAIT, but open-coded for clarity) rather than GFP_KERNEL for memory allocation in the capture path. This has the potential drawback that capture might fail in situations with memory pressure. [ 234.842048] WARNING: possible circular locking dependency detected [ 234.842050] 5.15.0-rc7+ #20 Tainted: G U W [ 234.842052] ------------------------------------------------------ [ 234.842054] gem_exec_captur/1180 is trying to acquire lock: [ 234.842056] ffffffffa3e51c00 (fs_reclaim){+.+.}-{0:0}, at: __kmalloc+0x4d/0x330 [ 234.842063] but task is already holding lock: [ 234.842064] ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915] [ 234.842138] which lock already depends on the new lock. [ 234.842140] the existing dependency chain (in reverse order) is: [ 234.842142] -> #2 (dma_fence_map){++++}-{0:0}: [ 234.842145] __dma_fence_might_wait+0x41/0xa0 [ 234.842149] dma_resv_lockdep+0x1dc/0x28f [ 234.842151] do_one_initcall+0x58/0x2d0 [ 234.842154] kernel_init_freeable+0x273/0x2bf [ 234.842157] kernel_init+0x16/0x120 [ 234.842160] ret_from_fork+0x1f/0x30 [ 234.842163] -> #1 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}: [ 234.842166] fs_reclaim_acquire+0x6d/0xd0 [ 234.842168] __kmalloc_node+0x51/0x3a0 [ 234.842171] alloc_cpumask_var_node+0x1b/0x30 [ 234.842174] native_smp_prepare_cpus+0xc7/0x292 [ 234.842177] kernel_init_freeable+0x160/0x2bf [ 234.842179] kernel_init+0x16/0x120 [ 234.842181] ret_from_fork+0x1f/0x30 [ 234.842184] -> #0 (fs_reclaim){+.+.}-{0:0}: [ 234.842186] __lock_acquire+0x1161/0x1dc0 [ 234.842189] lock_acquire+0xb5/0x2b0 [ 234.842192] fs_reclaim_acquire+0xa1/0xd0 [ 234.842193] __kmalloc+0x4d/0x330 [ 234.842196] i915_vma_coredump_create+0x78/0x5b0 [i915] [ 234.842253] intel_engine_coredump_add_vma+0x36/0xe0 [i915] [ 234.842307] __i915_gpu_coredump+0x290/0x5e0 [i915] [ 234.842365] i915_capture_error_state+0x57/0xa0 [i915] [ 234.842415] intel_gt_handle_error+0x348/0x3e0 [i915] [ 234.842462] intel_gt_debugfs_reset_store+0x3c/0x90 [i915] [ 234.842504] simple_attr_write+0xc1/0xe0 [ 234.842507] full_proxy_write+0x53/0x80 [ 234.842509] vfs_write+0xbc/0x350 [ 234.842513] ksys_write+0x58/0xd0 [ 234.842514] do_syscall_64+0x38/0x90 [ 234.842516] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 234.842519] other info that might help us debug this: [ 234.842521] Chain exists of: fs_reclaim --> mmu_notifier_invalidate_range_start --> dma_fence_map [ 234.842526] Possible unsafe locking scenario: [ 234.842528] CPU0 CPU1 [ 234.842529] ---- ---- [ 234.842531] lock(dma_fence_map); [ 234.842532] lock(mmu_notifier_invalidate_range_start); [ 234.842535] lock(dma_fence_map); [ 234.842537] lock(fs_reclaim); [ 234.842539] *** DEADLOCK *** [ 234.842540] 4 locks held by gem_exec_captur/1180: [ 234.842543] #0: ffff9007812d9460 (sb_writers#17){.+.+}-{0:0}, at: ksys_write+0x58/0xd0 [ 234.842547] #1: ffff900781d9ecb8 (&attr->mutex){+.+.}-{3:3}, at: simple_attr_write+0x3a/0xe0 [ 234.842552] #2: ffffffffc11913a8 (capture_mutex){+.+.}-{3:3}, at: i915_capture_error_state+0x1a/0xa0 [i915] [ 234.842602] #3: ffffffffa3f57620 (dma_fence_map){++++}-{0:0}, at: i915_vma_snapshot_resource_pin+0x27/0x30 [i915] [ 234.842656] stack backtrace: [ 234.842658] CPU: 0 PID: 1180 Comm: gem_exec_captur Tainted: G U W 5.15.0-rc7+ #20 [ 234.842661] Hardware name: ASUS System Product Name/PRIME B560M-A AC, BIOS 0403 01/26/2021 [ 234.842664] Call Trace: [ 234.842666] dump_stack_lvl+0x57/0x72 [ 234.842669] check_noncircular+0xde/0x100 [ 234.842672] ? __lock_acquire+0x3bf/0x1dc0 [ 234.842675] __lock_acquire+0x1161/0x1dc0 [ 234.842678] lock_acquire+0xb5/0x2b0 [ 234.842680] ? __kmalloc+0x4d/0x330 [ 234.842683] ? finish_task_switch.isra.0+0xf2/0x360 [ 234.842686] ? i915_vma_coredump_create+0x78/0x5b0 [i915] [ 234.842734] fs_reclaim_acquire+0xa1/0xd0 [ 234.842737] ? __kmalloc+0x4d/0x330 [ 234.842739] __kmalloc+0x4d/0x330 [ 234.842742] i915_vma_coredump_create+0x78/0x5b0 [i915] [ 234.842793] ? capture_vma+0xbe/0x110 [i915] [ 234.842844] intel_engine_coredump_add_vma+0x36/0xe0 [i915] [ 234.842892] __i915_gpu_coredump+0x290/0x5e0 [i915] [ 234.842939] i915_capture_error_state+0x57/0xa0 [i915] [ 234.842985] intel_gt_handle_error+0x348/0x3e0 [i915] [ 234.843032] ? __mutex_lock+0x81/0x830 [ 234.843035] ? simple_attr_write+0x3a/0xe0 [ 234.843038] ? __lock_acquire+0x3bf/0x1dc0 [ 234.843041] intel_gt_debugfs_reset_store+0x3c/0x90 [i915] [ 234.843083] ? _copy_from_user+0x45/0x80 [ 234.843086] simple_attr_write+0xc1/0xe0 [ 234.843089] full_proxy_write+0x53/0x80 [ 234.843091] vfs_write+0xbc/0x350 [ 234.843094] ksys_write+0x58/0xd0 [ 234.843096] do_syscall_64+0x38/0x90 [ 234.843098] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 234.843101] RIP: 0033:0x7fa467480877 [ 234.843103] Code: 75 05 48 83 c4 58 c3 e8 37 4e ff ff 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24 [ 234.843108] RSP: 002b:00007ffd14d79b08 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 234.843112] RAX: ffffffffffffffda RBX: 00007ffd14d79b60 RCX: 00007fa467480877 [ 234.843114] RDX: 0000000000000014 RSI: 00007ffd14d79b60 RDI: 0000000000000007 [ 234.843116] RBP: 0000000000000007 R08: 0000000000000000 R09: 00007ffd14d79ab0 [ 234.843119] R10: ffffffffffffffff R11: 0000000000000246 R12: 0000000000000014 [ 234.843121] R13: 0000000000000000 R14: 00007ffd14d79b60 R15: 0000000000000005 v5: - Use __GFP_KSWAPD_RECLAIM rather than __GFP_NOWAIT for clarity. (Daniel Vetter) v6: - Include an instance in execlists_capture_work(). - Rework the commit message due to patch reordering. Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Ramalingam C <ramalingam.c@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211108174547.979714-3-thomas.hellstrom@linux.intel.com
|
#
77cdd054 |
|
26-Oct-2021 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/pmu: Connect engine busyness stats from GuC to pmu With GuC handling scheduling, i915 is not aware of the time that a context is scheduled in and out of the engine. Since i915 pmu relies on this info to provide engine busyness to the user, GuC shares this info with i915 for all engines using shared memory. For each engine, this info contains: - total busyness: total time that the context was running (total) - id: id of the running context (id) - start timestamp: timestamp when the context started running (start) At the time (now) of sampling the engine busyness, if the id is valid (!= ~0), and start is non-zero, then the context is considered to be active and the engine busyness is calculated using the below equation engine busyness = total + (now - start) All times are obtained from the gt clock base. For inactive contexts, engine busyness is just equal to the total. The start and total values provided by GuC are 32 bits and wrap around in a few minutes. Since perf pmu provides busyness as 64 bit monotonically increasing values, there is a need for this implementation to account for overflows and extend the time to 64 bits before returning busyness to the user. In order to do that, a worker runs periodically at frequency = 1/8th the time it takes for the timestamp to wrap. As an example, that would be once in 27 seconds for a gt clock frequency of 19.2 MHz. Note: There might be an over-accounting of busyness due to the fact that GuC may be updating the total and start values while kmd is reading them. (i.e kmd may read the updated total and the stale start). In such a case, user may see higher busyness value followed by smaller ones which would eventually catch up to the higher value. v2: (Tvrtko) - Include details in commit message - Move intel engine busyness function into execlist code - Use union inside engine->stats - Use natural type for ping delay jiffies - Drop active_work condition checks - Use for_each_engine if iterating all engines - Drop seq locking, use spinlock at GuC level to update engine stats - Document worker specific details v3: (Tvrtko/Umesh) - Demarcate GuC and execlist stat objects with comments - Document known over-accounting issue in commit - Provide a consistent view of GuC state - Add hooks to gt park/unpark for GuC busyness - Stop/start worker in gt park/unpark path - Drop inline - Move spinlock and worker inits to GuC initialization - Drop helpers that are called only once v4: (Tvrtko/Matt/Umesh) - Drop addressed opens from commit message - Get runtime pm in ping, remove from the park path - Use cancel_delayed_work_sync in disable_submission path - Update stats during reset prepare - Skip ping if reset in progress - Explicitly name execlists and GuC stats objects - Since disable_submission is called from many places, move resetting stats to intel_guc_submission_reset_prepare v5: (Tvrtko) - Add a trylock helper that does not sleep and synchronize PMU event callbacks and worker with gt reset v6: (CI BAT failures) - DUTs using execlist submission failed to boot since __gt_unpark is called during i915 load. This ends up calling the GuC busyness unpark hook and results in kick-starting an uninitialized worker. Let park/unpark hooks check if GuC submission has been initialized. - drop cant_sleep() from trylock helper since rcu_read_lock takes care of that. v7: (CI) Fix igt@i915_selftest@live@gt_engines - For GuC mode of submission the engine busyness is derived from gt time domain. Use gt time elapsed as reference in the selftest. - Increase busyness calculation to 10ms duration to ensure batch runs longer and falls within the busyness tolerances in selftest. v8: - Use ktime_get in selftest as before - intel_reset_trylock_no_wait results in a lockdep splat that is not trivial to fix since the PMU callback runs in irq context and the reset paths are tightly knit into the driver. The test that uncovers this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait, instead use the reset_count to synchronize with gt reset during pmu callback. For the ping, continue to use intel_reset_trylock since ping is not run in irq context. - GuC PM timestamp does not tick when GuC is idle. This can potentially result in wrong busyness values when a context is active on the engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to process the GuC busyness stats. This works since both GuC timestamp and RING timestamp are synced with the same clock. - The busyness stats may get updated after the batch starts running. This delay causes the busyness reported for 100us duration to fall below 95% in the selftest. The only option at this time is to wait for GuC busyness to change from idle to active before we sample busyness over a 100us period. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com
|
#
e5e32171 |
|
14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Connect UAPI to GuC multi-lrc interface Introduce 'set parallel submit' extension to connect UAPI to GuC multi-lrc interface. Kernel doc in new uAPI should explain it all. IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1 media UMD: https://github.com/intel/media-driver/pull/1252 v2: (Daniel Vetter) - Add IGT link and placeholder for media UMD link v3: (Kernel test robot) - Fix warning in unpin engines call (John Harrison) - Reword a bunch of the kernel doc v4: (John Harrison) - Add comment why perma-pin is done after setting gem context - Update some comments / docs for proto contexts v5: (John Harrison) - Rework perma-pin comment - Add BUG_IN if context is pinned when setting gem context Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-17-matthew.brost@intel.com
|
#
4f3059dc |
|
14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Add logical engine mapping Add logical engine mapping. This is required for split-frame, as workloads need to be placed on engines in a logically contiguous manner. v2: (Daniel Vetter) - Add kernel doc for new fields v3: (Tvrtko) - Update comment for new logical_mask field v4: (John Harrison) - Update comment for new logical_mask field Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-6-matthew.brost@intel.com
|
#
1a839e01 |
|
05-Oct-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: remove IS_ACTIVE When trying to bring IS_ACTIVE to linux/kconfig.h I thought it wouldn't provide much value just encapsulating it in a boolean context. So I also added the support for handling undefined macros as the IS_ENABLED() counterpart. However the feedback received from Masahiro Yamada was that it is too ugly, not providing much value. And just wrapping in a boolean context is too dumb - we could simply open code it. As detailed in commit babaab2f4738 ("drm/i915: Encapsulate kconfig constant values inside boolean predicates"), the IS_ACTIVE macro was added to workaround a compilation warning. However after checking again our current uses of IS_ACTIVE it turned out there is only 1 case in which it triggers a warning in clang (due -Wconstant-logical-operand) and 2 in smatch. All the others can simply use the shorter version, without wrapping it in any macro. So here I'm dialing all the way back to simply removing the macro. That single case hit by clang can be changed to make the constant come first, so it doesn't think it's mask: - if (context && CONFIG_DRM_I915_FENCE_TIMEOUT) + if (CONFIG_DRM_I915_FENCE_TIMEOUT && context) As talked with Dan Carpenter, that logic will be added in smatch as well, so it will also stop warning about it. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211005171728.3147094-1-lucas.demarchi@intel.com
|
#
3e42cc61 |
|
22-Sep-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915/gt: Register the migrate contexts with their engines Pinned contexts, like the migrate contexts need reset after resume since their context image may have been lost. Also the GuC needs to register pinned contexts. Add a list to struct intel_engine_cs where we add all pinned contexts on creation, and traverse that list at resume time to reset the pinned contexts. This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now, but proper LMEM backup / restore is needed for full suspend functionality. However, note that even with full LMEM backup / restore it may be desirable to keep the reset since backing up the migrate context images must happen using memcpy() after the migrate context has become inactive, and for performance- and other reasons we want to avoid memcpy() from LMEM. Also traverse the list at guc_init_lrc_mapping() calling guc_kernel_context_pin() for the pinned contexts, like is already done for the kernel context. v2: - Don't reset the contexts on each __engine_unpark() but rather at resume time (Chris Wilson). v3: - Reset contexts in the engine sanitize callback. (Chris Wilson) Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Brost Matthew <matthew.brost@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com
|
#
ac653dd7 |
|
09-Sep-2021 |
Matthew Brost <matthew.brost@intel.com> |
Revert "drm/i915/gt: Propagate change in error status to children on unhold" Propagating errors to dependent fences is broken and can lead to errors from one client ending up in another. In commit 3761baae908a ("Revert "drm/i915: Propagate errors on awaiting already signaled fences""), we attempted to get rid of fence error propagation but missed the case added in commit 8e9f84cf5cac ("drm/i915/gt: Propagate change in error status to children on unhold"). Revert that one too. This error was found by an up-and-coming selftest which triggers a reset during request cancellation and verifies that subsequent requests complete successfully. v2: (Daniel Vetter) - Use revert v3: (Jason) - Update commit message v4 (Daniele): - fix checkpatch error in commit message. References: '3761baae908a ("Revert "drm/i915: Propagate errors on awaiting already signaled fences"")' Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210909164744.31249-8-matthew.brost@intel.com
|
#
058d7d62 |
|
02-Sep-2021 |
Colin Ian King <colin.king@canonical.com> |
drm/i915: clean up inconsistent indenting There is a statement that is indented one character too deeply, clean this up. Signed-off-by: Colin Ian King <colin.king@canonical.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210902215737.55570-1-colin.king@canonical.com
|
#
62eaf0ae |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Support request cancellation This adds GuC backend support for i915_request_cancel(), which in turn makes CONFIG_DRM_I915_REQUEST_TIMEOUT work. This implementation makes use of fence while there are likely simplier options. A fence was chosen because of another feature coming soon which requires a user to block on a context until scheduling is disabled. In that case we return the fence to the user and the user can wait on that fence. v2: (Daniele) - A comment about locking the blocked incr / decr - A comments about the use of the fence - Update commit message explaining why fence - Delete redundant check blocked count in unblock function - Ring buffer implementation - Comment about blocked in submission path - Shorter rpm path v3: (Checkpatch) - Fix typos in commit message (Daniel) - Rework to simplier locking structure in guc_context_block / unblock Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-26-matthew.brost@intel.com
|
#
d1cee2d3 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move active request tracking to a vfunc Move active request tracking to a backend vfunc rather than assuming all backends want to do this in the manner. In the of case execlists / ring submission the tracking is on the physical engine while with GuC submission it is on the context. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-8-matthew.brost@intel.com
|
#
a95d1160 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbs With GuC virtual engines the physical engine which a request executes and completes on isn't known to the i915. Therefore we can't attach a request to a physical engines breadcrumbs. To work around this we create a single breadcrumbs per engine class when using GuC submission and direct all physical engine interrupts to this breadcrumbs. v2: (John H) - Rework header file structure so intel_engine_mask_t can be in intel_engine_types.h Signed-off-by: Matthew Brost <matthew.brost@intel.com> CC: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-6-matthew.brost@intel.com
|
#
55612025 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: GuC virtual engines Implement GuC virtual engines. Rather simple implementation, basically just allocate an engine, setup context enter / exit function to virtual engine specific functions, set all other variables / functions to guc versions, and set the engine mask to that of all the siblings. v2: Update to work with proto-ctx v3: (Daniele) - Drop include, add comment to intel_virtual_engine_has_heartbeat Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-2-matthew.brost@intel.com
|
#
bfac1e2b |
|
23-Jul-2021 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Xe_HP forcewake support Implement Xe_HP forcewake handling. While we're at it, let's reorder to the forcewake assignment if/else ladder to match our usual driver conventions. Co-authored-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210723174239.1551352-6-matthew.d.roper@intel.com
|
#
50a9ea08 |
|
21-Jul-2021 |
Stuart Summers <stuart.summers@intel.com> |
drm/i915/xehp: Handle new device context ID format Xe_HP changes the format of the context ID from past platforms. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210721223043.834562-9-matthew.d.roper@intel.com
|
#
dd4f1bba |
|
08-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2) Even though FENCE_SUBMIT is only documented to wait until the request in the in-fence starts instead of waiting until it completes, it has a bit more magic than that. If FENCE_SUBMIT is used to submit something to a balanced engine, we would wait to assign engines until the primary request was ready to start and then attempt to assign it to a different engine than the primary. There is an IGT test (the bonded-slice subtest of gem_exec_balancer) which exercises this by submitting a primary batch to a specific VCS and then using FENCE_SUBMIT to submit a secondary which can run on any VCS and have i915 figure out which VCS to run it on such that they can run in parallel. However, this functionality has never been used in the real world. The media driver (the only user of FENCE_SUBMIT) always picks exactly two physical engines to bond and never asks us to pick which to use. v2 (Daniel Vetter): - Mention the exact IGT test this breaks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-11-jason@jlekstrand.net
|
#
521695c6 |
|
08-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915/gem: Disallow bonding of virtual engines (v3) This adds a bunch of complexity which the media driver has never actually used. The media driver does technically bond a balanced engine to another engine but the balanced engine only has one engine in the sibling set. This doesn't actually result in a virtual engine. This functionality was originally added to handle cases where we may have more than two video engines and media might want to load-balance their bonded submits by, for instance, submitting to a balanced vcs0-1 as the primary and then vcs2-3 as the secondary. However, no such hardware has shipped thus far and, if we ever want to enable such use-cases in the future, we'll use the up-and-coming parallel submit API which targets GuC submission. This makes I915_CONTEXT_ENGINES_EXT_BOND a total no-op. We leave the validation code in place in case we ever decide we want to do something interesting with the bonding information. v2 (Jason Ekstrand): - Don't delete quite as much code. v3 (Tvrtko Ursulin): - Add some history to the commit message Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-10-jason@jlekstrand.net
|
#
4a766ae4 |
|
08-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915: Drop the CONTEXT_CLONE API (v2) This API allows one context to grab bits out of another context upon creation. It can be used as a short-cut for setparam(getparam()) for things like I915_CONTEXT_PARAM_VM. However, it's never been used by any real userspace. It's used by a few IGT tests and that's it. Since it doesn't add any real value (most of the stuff you can CLONE you can copy in other ways), drop it. There is one thing that this API allows you to clone which you cannot clone via getparam/setparam: timelines. However, timelines are an implementation detail of i915 and not really something that needs to be exposed to userspace. Also, sharing timelines between contexts isn't obviously useful and supporting it has the potential to complicate i915 internally. It also doesn't add any functionality that the client can't get in other ways. If a client really wants a shared timeline, they can use a syncobj and set it as an in and out fence on every submit. v2 (Jason Ekstrand): - More detailed commit message Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-7-jason@jlekstrand.net
|
#
22916bad |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move submission tasklet to i915_sched_engine The submission tasklet operates on i915_sched_engine, thus it is the correct place for it. v3: (Jason Ekstrand) Change sched_engine->engine to a void* private data pointer Add kernel doc v4: (Daniele) Update private_data comment Set queue_priority_hint in kick_execlists v5: (CI) Rebase and fix build error Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-9-matthew.brost@intel.com
|
#
d2a31d02 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Update i915_scheduler to operate on i915_sched_engine Rather passing around an intel_engine_cs in the scheduling code, pass around a i915_sched_engine. v3: (Jason Ekstrand) Add READ_ONCE around rq->engine in lock_sched_engine Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-8-matthew.brost@intel.com
|
#
71ed6011 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Add kick_backend function to i915_sched_engine Not all back-ends require a kick after a scheduling update, so make the kick a call-back function that the back-end can opt-in to. Also move the current kick function from the scheduler to the execlists file as it is specific to that back-end. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-7-matthew.brost@intel.com
|
#
3f623e06 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move engine->schedule to i915_sched_engine The schedule function should be in the schedule object. v3: (Jason Ekstrand) Add kernel doc Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com
|
#
349a2bc5 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move active tracking to i915_sched_engine Move active request tracking and its lock to i915_sched_engine. This lock is also the submission lock so having it in the i915_sched_engine is the correct place. v3: (Jason Ekstrand) Add kernel doc v6: Rebase Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.comk> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com
|
#
c4fd7d8c |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Reset sched_engine.no_priolist immediately after dequeue Rather than touching schedule state in the generic PM code, reset the priolist allocation when empty in the submission code. Add a wrapper function to do this and update the backends to call it in the correct place. v3: (Jason Ekstrand) Update patch commit message with a better description Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-4-matthew.brost@intel.com
|
#
074bb195 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Add i915_sched_engine_is_empty function Add wrapper function around RB tree to determine if i915_sched_engine is empty. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-3-matthew.brost@intel.com
|
#
3e28d371 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move priolist to new i915_sched_engine object Introduce i915_sched_engine object which is lower level data structure that i915_scheduler / generic code can operate on without touching execlist specific structures. This allows additional submission backends to be added without breaking the layering. Currently the execlists backend uses 1 of these object per each engine (physical or virtual) but future backends like the GuC will point to less instances utilizing the reference counting. This is a bit of detour to integrating the i915 with the DRM scheduler but this object will still exist when the DRM scheduler lands in the i915. It will however look a bit different. It will encapsulate the drm_gpu_scheduler object plus and common variables (to the backends) related to scheduling. Regardless this is a step in the right direction. This patch starts the aforementioned transition by moving the priolist into the i915_sched_engine object. v3: (Jason Ekstrand) Update comment next to intel_engine_cs.virtual Add kernel doc (Checkpatch) Fix double the in commit message v4: (Daniele) Update comment message. Add comment about subclass field Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-2-matthew.brost@intel.com
|
#
c816723b |
|
05-Jun-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
|
#
0669a6e1 |
|
21-May-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move CS interrupt handler to the backend The different submission backends each have their own preferred behaviour and interrupt setup. Let each handle their own interrupts. This becomes more useful later as we to extract the use of auxiliary state in the interrupt handler that is backend specific. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-4-matthew.brost@intel.com
|
#
c92c36ed |
|
21-May-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move submission_method into intel_gt Since we setup the submission method for the engines once, it is easy to assign an enum and use that instead of probing into the backends. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-3-matthew.brost@intel.com
|
#
0db3633f |
|
21-May-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move engine setup out of set_default_submission Now that we no longer switch back and forth between guc and execlists, we no longer need to restore the backend's vfunc and can leave them set after initialisation. The only catch is that we lose the submission on wedging and still need to reset the submit_request vfunc on unwedging. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-2-matthew.brost@intel.com
|
#
90a79a91 |
|
23-Mar-2021 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Handle async cancellation in sentinel assert With the watchdog cancelling requests asynchronously to preempt-to-busy we need to relax one assert making it apply only to requests not in error. v2: * Check against the correct request! v3: * Simplify the check to avoid the question of when to sample the fence error vs sentinel bit. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-5-tvrtko.ursulin@linux.intel.com
|
#
8f922e42 |
|
23-Mar-2021 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Restrict sentinel requests further Disallow sentinel requests follow previous sentinels to make request cancellation work better when faced with a chain of requests which have all been marked as in error. Because in cases where we end up with a stream of cancelled requests we want to turn off request coalescing so they each will get individually skipped by the execlists_schedule_in (which is called per ELSP port, not per request). Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> [danvet: Fix typo in the commit message that Matthew spotted.] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-4-tvrtko.ursulin@linux.intel.com
|
#
38b237ea |
|
23-Mar-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Individual request cancellation Currently, we cancel outstanding requests within a context when the context is closed. We may also want to cancel individual requests using the same graceful preemption mechanism. v2 (Tvrtko): * Cancel waiters carefully considering no timeline lock and RCU. * Fixed selftests. v3 (Tvrtko): * Remove error propagation to waiters for now. v4 (Tvrtko): * Rebase for extracted i915_request_active_engine. (Matt) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> [danvet: Resolve conflict because intel_engine_flush_scheduler is still called intel_engine_flush_submission] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210324121335.2307063-3-tvrtko.ursulin@linux.intel.com
|
#
c9a995e5 |
|
01-Feb-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Retire unexpected starting state error dumping We have not seen an occurrence of the false restart state recenty, and if we did see such an event from inside engine-reset, it would deadlock on trying to suspend the tasklet to read the register state (from inside the tasklet). Instead, we inspect the context state before submission which will alert us to any issues prior to execution on HW. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210201164222.14455-1-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
c10e4a79 |
|
01-Feb-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Protect against request freeing during cancellation on wedging As soon as we mark a request as completed, it may be retired. So when cancelling a request and marking it complete, make sure we first keep a reference to the request. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210201085715.27435-4-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
2913fa4d |
|
26-Jan-2021 |
Emil Renner Berthing <kernel@esmil.dk> |
drm/i915/gt: use new tasklet API for execution list This converts the driver to use the new tasklet API introduced in commit 12cc923f1ccc ("tasklet: Introduce new initialization API") v2: Fix up selftests/execlists. Signed-off-by: Emil Renner Berthing <kernel@esmil.dk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210126150155.1617-1-kernel@esmil.dk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
b3f0c15a |
|
25-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move the defer_request waiter active assertion In defer_request() we start with the request we just unsubmitted (that should be the active request on the gpu) and then defer all of its waiters. No waiter should be ahead of the active request, so none should be marked as active. That assert failed. Of particular note this machine was undergoing persistent GPU resets due to underlying HW issues, so that may be a clue. A request is also marked as active when it is retired, regardless of current queue status, and so this assertion failure may be a result of the queue being completed by the reset and then subsequently processed by the tasklet. We can filter out retired requests here by doing the assertion check after the is-ready check (active is a subset of being ready). Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2978 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210125140136.10494-2-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
7898843c |
|
22-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Fixup misaligned function parameters Remember to align parameters to the '(', thanks checkpatch Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-4-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
1ca9b8da |
|
22-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Remove repeated words from comments Checkpatch spotted a few repeated words in the comment, genuine mistakes. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-3-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
2867ff6c |
|
19-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Strip out internal priorities Since we are not using any internal priority levels, and in the next few patches will introduce a new index for which the optimisation is not so lear cut, discard the small table within the priolist. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210120121439.17600-1-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
9c01524d |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Populate logical context during first pin. This allows us to remove pin_map from state allocation, which saves us a few retry loops. We won't need this until first pin, anyway. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> [danvet: Resolve context conflict because we don't have the i915_scheduler.c extraction from the below patches set: https://lore.kernel.org/intel-gfx/20210203165259.13087-6-chris@chris-wilson.co.uk/] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-19-maarten.lankhorst@linux.intel.com
|
#
a2dd2ff5 |
|
19-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Skip over completed active execlists, again Now that we are careful to always force-restore contexts upon rewinding (where necessary), we can restore our optimisation to skip over completed active execlists when dequeuing. Referenecs: 35f3fd8182ba ("drm/i915/execlists: Workaround switching back to a completed context") References: 8ab3a3812aa9 ("drm/i915/gt: Incrementally check for rewinding") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210120121718.26435-2-chris@chris-wilson.co.uk
|
#
aba73826 |
|
19-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Do not suspend bonded requests if one hangs Treat the dependency between bonded requests as weak and leave the remainder of the pair on the GPU if one hangs. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210120121718.26435-1-chris@chris-wilson.co.uk
|
#
4fb05a39 |
|
15-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Extract busy-stats for ring-scheduler Lift the busy-stats context-in/out implementation out of intel_lrc, so that we can reuse it for other scheduler implementations. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-2-chris@chris-wilson.co.uk
|
#
2c421896 |
|
15-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Drop atomic for engine->fw_active tracking Since schedule-in/out is now entirely serialised by the tasklet bitlock, we do not need to worry about concurrent in/out operations and so reduce the atomic operations to plain instructions. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-1-chris@chris-wilson.co.uk
|
#
163433e5 |
|
14-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Mark up protected uses of 'i915_request_completed' When we know that we are inside the timeline mutex, or inside the submission flow (under active.lock or the holder's rcu lock), we know that the rq->hwsp is stable and we can use the simpler direct version. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk
|
#
e7326336 |
|
13-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Prune 'inline' from execlists Remove the extraneous inlines. The only split by the compiler that looked dubious was execlists_schedule_out, so push the code around slightly to move all the work into the out-of-line function. In a normal build, bloat-o-meter shows that only the execlists_schedule_out is contentious: add/remove: 1/0 grow/shrink: 0/2 up/down: 803/-1532 (-729) Function old new delta __execlists_schedule_out - 803 +803 execlists_submission_tasklet 6488 5766 -722 execlists_reset_csb.constprop 1587 777 -810 Total: Before=1605815, After=1605086, chg -0.05% Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210113151112.15212-1-chris@chris-wilson.co.uk
|
#
007c4578 |
|
12-Jan-2021 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/guc: stop calling execlists_set_default_submission Initialize all required entries from guc_set_default_submission, instead of calling the execlists function. The previously inherited setup has been copied over from the execlist code and simplified by removing the execlists submission-specific parts. v2: move setting of relative_mmio flag to engine_setup_common (Chris) Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <john.c.harrison@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-5-daniele.ceraolospurio@intel.com
|
#
0da3f250 |
|
11-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Disable arbitration around Braswell's pdp updates Braswell's pdp workaround is full of dragons, that may be being angered when they are interrupted. Let's not take that risk and disable arbitration during the update. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210111105735.21515-1-chris@chris-wilson.co.uk
|
#
baa7c2cd |
|
09-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Refactor marking a request as EIO When wedging the device, we cancel all outstanding requests and mark them as EIO. Rather than duplicate the small function to do so between each submission backend, export one. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210109163455.28466-3-chris@chris-wilson.co.uk
|
#
9a437ccb |
|
09-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Exercise lrc_wa_ctx initialisation failure Inject a fault into lrc_init_wa_ctx() to ensure that we can tolerate a failure to construct the workarounds. v2: Avoid mentioning an error for fault-injection, other CI will complain about the dmesg spam. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210109114453.27798-1-chris@chris-wilson.co.uk
|
#
751f82b3 |
|
08-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Only disable preemption on gen8 render engines The reason why we did not enable preemption on Broadwater was due to missing GPGPU workarounds. Since this only applies to rcs0, only restrict rcs0 (and our global capabilities). While this does not affect exposing a preemption capability to userspace, it does affect our internal decisions on whether to use timeslicing and semaphores between individual engines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-6-chris@chris-wilson.co.uk
|
#
b1ad5f6d |
|
08-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Only retire on the last breadcrumb if the last request We use the completion of the last active breadcrumb to retire the requests along a timeline. This is purely opportunistic as nothing guarantees that any particular timeline is terminated by a breadcrumb; except for parking the engine where we explicitly add a breadcrumb so that we park quickly and do an explicit retire upon signaling to reduce the latency dramatically (avoiding a retire worker roundtrip). With scheduling, we anticipate retiring completed timelines as a matter of course. Performing the same action from inside the breadcrumbs is intended to provide similar functionality for legacy ringbuffer submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-5-chris@chris-wilson.co.uk
|
#
2b2985a4 |
|
08-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Restore ce->signal flush before releasing virtual engine Before we mark the virtual engine as no longer inflight, flush any ongoing signaling that may be using the ce->signal_link along the previous breadcrumbs. On switch to a new physical engine, that link will be inserted into the new set of breadcrumbs, causing confusion to an ongoing iterator. This patch undoes a last minute mistake introduced into commit bab0557c8dca ("drm/i915/gt: Remove virtual breadcrumb before transfer"), whereby instead of unconditionally applying the flush, it was only applied if the request itself was going to be reused. v2: Generalise and cancel all remaining ce->signals Fixes: bab0557c8dca ("drm/i915/gt: Remove virtual breadcrumb before transfer") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210108204026.20682-4-chris@chris-wilson.co.uk
|
#
4386b8e5 |
|
07-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Remove timeslice suppression In the next^W future patch, we remove the strict priority system and continuously re-evaluate the relative priority of tasks. As such we need to enable the timeslice whenever there is more than one context in the pipeline. This simplifies the decision and removes some of the tweaks to suppress timeslicing, allowing us to lift the timeslice enabling to a common spot at the end of running the submission tasklet. One consequence of the suppression is that it was reducing fairness between virtual engines on an over saturated system; undermining the principle for timeslicing. v2: Commentary v3: Commentary for the right cancel_timer() v4: Add tracing for why we need a timeslice Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2802 Testcase: igt/gem_exec_balancer/fairslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210107132322.28373-1-chris@chris-wilson.co.uk
|
#
0e58de9f |
|
04-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Check the virtual still matches upon locking If another sibling is able to claim the virtual request, by the time we inspect the request under the lock it may no longer match the local engine. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2877 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-4-chris@chris-wilson.co.uk
|
#
0a7d355e |
|
04-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Allow failed resets without assertion If the engine reset fails, we will attempt to resume with the current inflight submissions. When that happens, we cannot assert that the engine reset cleared the pending submission, so do not. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2878 Fixes: 16f2941ad307 ("drm/i915/gt: Replace direct submit with direct call to tasklet") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-3-chris@chris-wilson.co.uk
|
#
093a0bea |
|
31-Dec-2020 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Populate logical context during first pin. This allows us to remove pin_map from state allocation, which saves us a few retry loops. We won't need this until first pin, anyway. Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20201231170405.22843-1-chris@chris-wilson.co.uk
|
#
9c080b0f |
|
31-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Pull context closure check from request submit to schedule-in We only need to evaluate the current status of the context when it is scheduled in, we will force a reschedule when the context is closed propagating the change to inflight contexts. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201231093946.11649-1-chris@chris-wilson.co.uk
|
#
7904e081 |
|
30-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Cancel submitted requests upon context reset Since we process schedule-in of a context after submitting the request, if we decide to reset the context at that time, we also have to cancel the requets we have marked for submission. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201230220028.17089-1-chris@chris-wilson.co.uk
|
#
cc1557ca |
|
29-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gem: Peek at the inflight context If supported by the backend, we can quickly look at the context's inflight engine rather than search along the active list to confirm. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201229144114.31686-1-chris@chris-wilson.co.uk
|
#
177b7a52 |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: ce->inflight updates are now serialised Since schedule-in and schedule-out are now both always under the tasklet bitlock, we can reduce the individual atomic operations to simple instructions and worry less. This notably eliminates the race observed with intel_context_inflight in __engine_unpark(). Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2583 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-9-chris@chris-wilson.co.uk
|
#
ac1a6d73 |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Simplify virtual engine handling for execlists_hold() Now that the tasklet completely controls scheduling of the requests, and we postpone scheduling out the old requests, we can keep a hanging virtual request bound to the engine on which it hung, and remove it from te queue. On release, it will be returned to the same engine and remain in its queue until it is scheduled; after which point it will become eligible for transfer to a sibling. Instead, we could opt to resubmit the request along the virtual engine on unhold, making it eligible for load balancing immediately -- but that seems like a pointless optimisation for a hanging context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-8-chris@chris-wilson.co.uk
|
#
f81475bb |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Resubmit the virtual engine on schedule-out Having recognised that we do not change the sibling until we schedule out, we can then defer the decision to resubmit the virtual engine from the unwind of the active queue to scheduling out of the virtual context. This improves our resilence in virtual engine scheduling, and should eliminate the rare cases of gem_exec_balance failing. By keeping the unwind order intact on the local engine, we can preserve data dependency ordering while doing a preempt-to-busy pass until we have determined the new ELSP. This means that if we try to timeslice between a virtual engine and a data-dependent ordinary request, the pair will maintain their relative ordering and we will avoid the resubmission, cancelling the timeslicing until further change. The dilemma though is that we then may end up in a situation where the 'demotion' of the virtual request to an ordinary request in the engine queue results in filling the ELSP[] with virtual requests instead of spreading the load across the engines. To compensate for this, we mark each virtual request and refuse to resubmit a virtual request in the secondary ELSP slots, thus forcing subsequent virtual requests to be scheduled out after timeslicing. By delaying the decision until we schedule out, we will avoid unnecessary resubmission. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2079 Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2098 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-7-chris@chris-wilson.co.uk
|
#
66e40750 |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Shrink the critical section for irq signaling Let's only wait for the list iterator when decoupling the virtual breadcrumb, as the signaling of all the requests may take a long time, during which we do not want to keep the tasklet spinning. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-6-chris@chris-wilson.co.uk
|
#
bab0557c |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Remove virtual breadcrumb before transfer The issue with stale virtual breadcrumbs remain. Now we have the problem that if the irq-signaler is still referencing the stale breadcrumb as we transfer it to a new sibling, the list becomes spaghetti. This is a very small window, but that doesn't stop it being hit infrequently. To prevent the lists being tangled (the iterator starting on one engine's b->signalers but walking onto another list), always decouple the virtual breadcrumb on schedule-out and make sure that the walker has stepped out of the lists. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-5-chris@chris-wilson.co.uk
|
#
6f0726b4 |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Defer schedule_out until after the next dequeue Inside schedule_out, we do extra work upon idling the context, such as updating the runtime, kicking off retires, kicking virtual engines. However, if we are in a series of processing single requests per contexts, we may find ourselves scheduling out the context, only to immediately schedule it back in during dequeue. This is just extra work that we can avoid if we keep the context marked as inflight across the dequeue. This becomes more significant later on for minimising virtual engine misses. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-4-chris@chris-wilson.co.uk
|
#
2efa2c52 |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Decouple inflight virtual engines Once a virtual engine has been bound to a sibling, it will remain bound until we finally schedule out the last active request. We can not rebind the context to a new sibling while it is inflight as the context save will conflict, hence we wait. As we cannot then use any other sibliing while the context is inflight, only kick the bound sibling while it inflight and upon scheduling out the kick the rest (so that we can swap engines on timeslicing if the previously bound engine becomes oversubscribed). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-3-chris@chris-wilson.co.uk
|
#
64b7a3fa |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Use virtual_engine during execlists_dequeue Rather than going back and forth between the rb_node entry and the virtual_engine type, store the ve local and reuse it. As the container_of conversion from rb_node to virtual_engine requires a variable offset, performing that conversion just once shaves off a bit of code. v2: Keep a single virtual engine lookup, for typical use. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-2-chris@chris-wilson.co.uk
|
#
16f2941a |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Replace direct submit with direct call to tasklet Rather than having special case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths, and most importantly greatly simplifying the control flow by removing reentrancy. This will enable us to close a few races in the virtual engines in the next few patches. The trickiest part here is to ensure that paired operations (such as schedule_in/schedule_out) remain under consistent locking domains, e.g. when pulled outside of the engine->active.lock v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET softirq synchronous"). v3: Update engine-reset to be tasklet aware Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
|
#
b436a5f8 |
|
22-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Track all timelines created using the HWSP We assume that the contents of the HWSP are lost across suspend, and so upon resume we must restore critical values such as the timeline seqno. Keep track of every timeline allocated that uses the HWSP as its storage and so we can then reset all seqno values by walking that list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201222104242.10993-1-chris@chris-wilson.co.uk
|
#
a0d3fdb6 |
|
18-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Split logical ring contexts from execlist submission Split the definition, construction and updating of the Logical Ring Context from the execlist submission interface. The LRC is used by the HW, irrespective of our different submission backends. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201219020343.22681-1-chris@chris-wilson.co.uk
|
#
d0d829e5 |
|
09-Dec-2020 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: split gen8+ flush and bb_start emission functions These functions are independent from the backend used and can therefore be split out of the exelists submission file, so they can be re-used by the upcoming GuC submission backend. Based on a patch by Chris Wilson. Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-3-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
|
#
70a2b431 |
|
09-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Rename lrc.c to execlists_submission.c We want to separate the utility functions for controlling the logical ring context from the execlists submission mechanism (which is an overgrown scheduler). This is similar to Daniele's work to split up the files, but being selfish I wanted to base it after my own changes to intel_lrc.c petered out. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-2-chris@chris-wilson.co.uk
|