History log of /linux-master/drivers/gpu/drm/i915/gt/intel_engine_cs.c
Revision Date Author Comments
# ea315f98 28-Mar-2024 Andi Shyti <andi.shyti@linux.intel.com>

drm/i915/gt: Do not generate the command streamer for all the CCS

We want a fixed load CCS balancing consisting in all slices
sharing one single user engine. For this reason do not create the
intel_engine_cs structure with its dedicated command streamer for
CCS slices beyond the first.

Fixes: d2eae8e98d59 ("drm/i915/dg2: Drop force_probe requirement")
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: <stable@vger.kernel.org> # v6.2+
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240328073409.674098-3-andi.shyti@linux.intel.com
(cherry picked from commit c7a5aa4e57f88470313a8277eb299b221b86e3b1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 84bf82f4 08-Jan-2024 Harish Chegondi <harish.chegondi@intel.com>

drm/i915/xelpg: Extend driver code of Xe_LPG to Xe_LPG+

Xe_LPG+ (IP version 12.74) should take the same general code
paths as Xe_LPG (versions 12.70 and 12.71).

Xe_LPG+'s workaround list will be handled by the next patch.

Signed-off-by: Harish Chegondi <harish.chegondi@intel.com>
Signed-off-by: Haridhar Kalvala <haridhar.kalvala@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240108122738.14399-3-haridhar.kalvala@intel.com


# 5fbae687 27-Oct-2023 Dorcas AnonoLitunya <anonolitunya@gmail.com>

drm/i915/gt: Remove prohibited space after opening parenthesis

Removes space after opening parenthesis.

Fixes the checkpatch.pl error:
ERROR: space prohibited after that opening parenthesis '('

Signed-off-by: Dorcas AnonoLitunya <anonolitunya@gmail.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231027174745.4058-1-anonolitunya@gmail.com


# f1cdb599 11-Oct-2023 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Prevent potential null-ptr-deref in engine_init_common

If measure_breadcrumb_dw() returns an error and bce isn't created,
this commit ensures that intel_engine_destroy_pinned_context()
is not called with a NULL bce.

v2: Fix the subject s/UAF/null-ptr-deref(Jani)

Fixes: b35274993680 ("drm/i915: Create a kernel context for GGTT updates")
Cc: Oak Zeng <oak.zeng@intel.com>
Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231011122547.7085-1-nirmoy.das@intel.com


# e96aef07 09-Oct-2023 John Harrison <John.C.Harrison@Intel.com>

drm/i915/gt: More use of GT specific print helpers

A bunch of print messages got missed in the update to using sub-system
specific helpers. So update those.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231009183802.673882-2-John.C.Harrison@Intel.com


# 3f5f6288 26-Sep-2023 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Parameterize binder context creation

Add i915_ggtt_require_binder() to indicate that i915
needs to create binder context which will be used
by subsequent patch to enable i915_address_space vfuncs
that will use GPU commands to update GGTT.

Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230926083742.14740-5-nirmoy.das@intel.com


# b3527499 26-Sep-2023 Nirmoy Das <nirmoy.das@intel.com>

drm/i915: Create a kernel context for GGTT updates

Create a separate kernel context if a platform requires
GGTT updates using MI_UPDATE_GTT blitter command.

Subsequent patch will introduce methods to update
GGTT using this bind context and MI_UPDATE_GTT blitter
command.

v2: fix context leak on err(Oak)
v3: improve err handling and improve function names(Andi)
add docs for few functions.

Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230926083742.14740-3-nirmoy.das@intel.com


# 4485bd51 12-Sep-2023 Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

i915/pmu: Move execlist stats initialization to execlist specific setup

engine->stats is a union of execlist and guc stat objects. When execlist
specific fields are initialized, the initial state of guc stats is
affected. This results in bad busyness values when using GuC mode. Move
the execlist initialization from common code to execlist specific code.

Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230912212247.1828681-1-umesh.nerlige.ramappa@intel.com


# 28c46fee 21-Aug-2023 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Consolidate condition for Wa_22011802037

The workaround bounds for Wa_22011802037 are somewhat complex and are
replicated in several places throughout the code. Pull the condition
out to a helper function to prevent mistakes if this condition needs to
change again in the future.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Gustavo Sousa <gustavo.sousa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230821180619.650007-12-matthew.d.roper@intel.com


# c524cd40 12-Sep-2023 Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

i915/pmu: Move execlist stats initialization to execlist specific setup

engine->stats is a union of execlist and guc stat objects. When execlist
specific fields are initialized, the initial state of guc stats is
affected. This results in bad busyness values when using GuC mode. Move
the execlist initialization from common code to execlist specific code.

Fixes: 77cdd054dd2c ("drm/i915/pmu: Connect engine busyness stats from GuC to pmu")
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230912212247.1828681-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 4485bd519f5d6d620a29d0547ff3c982bdeeb468)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# d3f23ab9 20-Jul-2023 Andrzej Hajda <andrzej.hajda@intel.com>

drm/i915: use direct alias for i915 in requests

i915_request contains direct alias to i915, there is no point to go via
rq->engine->i915.

v2: added missing rq.i915 initialization in measure_breadcrumb_dw.

Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230720113002.1541572-1-andrzej.hajda@intel.com


# 9c55105b 02-May-2023 Jani Nikula <jani.nikula@intel.com>

drm/i915/engine: fix kernel-doc function name for intel_engine_cleanup_common()

drivers/gpu/drm/i915/gt/intel_engine_cs.c:1525: warning: expecting prototype for intel_engines_cleanup_common(). Prototype was for intel_engine_cleanup_common() instead

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/33f8dfdf38be3e16675971e6983e3e300d4301a6.1683041799.git.jani.nikula@intel.com


# 83842357 10-Mar-2023 Nirmoy Das <nirmoy.das@intel.com>

drm/i915/gt: Update engine_init_common documentation

Change the function doc to reflect updated name.

v2: un-kerneldoc the comment(Matt).
:s/engines_init_common/engine_init_common(Andi)

Cc: Andi Shyti <andi.shyti@linux.intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Nirmoy Das <nirmoy.das@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230310101024.4700-1-nirmoy.das@intel.com


# 7416cbbc 10-Feb-2023 Andi Shyti <andi.shyti@linux.intel.com>

drm/i915/gt: Rename dev_priv to i915 for private data naming consistency

It has become common practice to refer to the drm_i915_private
structures as "i915". However, there are still instances where
they are referred to as "dev_priv". This inconsistency can make
grepping for information more difficult and does not maintain a
cohesive style throughout the code.

Rename all the "dev_priv" structures in the gt/* directory to
"i915".

Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230210150344.1066991-1-andi.shyti@linux.intel.com


# 1c388da5 23-Feb-2023 Matt Roper <matthew.d.roper@intel.com>

drm/i915/mtl: Add engine TLB invalidation

MTL's primary GT can continue to use the same engine TLB invalidation
programming as past Xe_HP-based platforms. However the media GT needs
some special handling:
* Invalidation registers on the media GT are singleton registers
(unlike the primary GT where they are still MCR).
* Since the GSC is now exposed as an engine, there's a new register to
use for TLB invalidation. The offset is identical to the compute
engine offset, but this is expected --- compute engines only exist on
the primary GT while the GSC only exists on the media GT.
* Although there's only a single GSC engine instance, it inexplicably
uses bit 1 to request invalidations rather than bit 0.

v2:
- Add a 'regs == xelpmp_regs' condition to the GSC instance handling.
If the registers change on a future platform, the GSC-specific
handling is likely to change as well. (Andrzej)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230224012009.3594691-1-matthew.d.roper@intel.com


# 1008266e 16-Feb-2023 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Consolidate TLB invalidation flow

As the logic for selecting the register and corresponsing values grew, the
code become a bit unsightly. Consolidate by storing the required values at
engine init time in the engine itself, and by doing so minimise the amount
of invariant platform and engine checks during each and every TLB
invalidation.

v2:
* Fail engine probe if TLB invlidations registers are unknown.

v3:
* Rebase.

v4:
* Fix handling of GEN8_M2TCR. (Andrzej)

v5:
* Tidy checkpatch warnings.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> # v1
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230216092123.159085-1-tvrtko.ursulin@linux.intel.com


# 4d70c746 30-Nov-2022 Andy Shevchenko <andriy.shevchenko@linux.intel.com>

i915: Move list_count() to list.h as list_count_nodes() for broader use

Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API for that by
moving code from i915 to list.h.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221130134838.23805-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 003e11ed 24-Jan-2023 John Harrison <John.C.Harrison@Intel.com>

drm/i915/mtl: Wa_22011802037: don't complain about missing regs on MTL

Wa_22011802037 requires waiting for an engine-specific register to
clear. A missing entry for GSC engine in the register table is flagged
as a drm_err. The drm_err was originally intended to catch missing
register entries for newer engines, however, it was later found that the
WA is only required for 'legacy' engines. So just drop the drm_err.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230124231111.1786429-1-umesh.nerlige.ramappa@intel.com


# a4be3dca 26-Jan-2023 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Fix up locking around dumping requests lists

The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.

So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.

v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com


# 3700e353 26-Jan-2023 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Fix request ref counting during error capture & debugfs dump

When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.

The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.

The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.

v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com


# 41bb543f 05-Jan-2023 Matt Roper <matthew.d.roper@intel.com>

drm/i915/mtl: Add initial gt workarounds

This patch introduces initial gt workarounds for the MTL platform.

v2: drop redundant/stale comments specifying wa platforms affected
(Lucas).
v3: drop additional redundant stale comments (MattR)

Bspec: 66622

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Atwood <matthew.s.atwood@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230105234408.277750-1-matthew.s.atwood@intel.com


# 6b7cbdbe 08-Dec-2022 Jonathan Cavitt <jonathan.cavitt@intel.com>

drm/i915/gsc: Disable GSC engine and power well if FW is not selected

The GSC CS is only used for communicating with the GSC FW, so no need to
initialize it if we're not going to use the FW. If we're not using
neither the engine nor the microcontoller, then we can also disable the
power well.

IMPORTANT: lack of GSC FW breaks media C6 due to opposing requirements
between CS setup and forcewake idleness. See in-code comment for detail.

Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: John C Harrison <John.C.Harrison@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221208200521.2928378-6-daniele.ceraolospurio@intel.com


# 5bc4b43d 26-Jan-2023 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Fix up locking around dumping requests lists

The debugfs dump of requests was confused about what state requires
the execlist lock versus the GuC lock. There was also a bunch of
duplicated messy code between it and the error capture code.

So refactor the hung request search into a re-usable function. And
reduce the span of the execlist state lock to only the execlist
specific code paths. In order to do that, also move the report of hold
count (which is an execlist only concept) from the top level dump
function to the lower level execlist specific function. Also, move the
execlist specific code into the execlist source file.

v2: Rename some functions and move to more appropriate files (Daniele).
v3: Rename new execlist dump function (Daniele)

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-4-John.C.Harrison@Intel.com
(cherry picked from commit a4be3dca53172d9d2091e4b474fb795c81ed3d6c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 86d8ddc7 26-Jan-2023 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Fix request ref counting during error capture & debugfs dump

When GuC support was added to error capture, the reference counting
around the request object was broken. Fix it up.

The context based search manages the spinlocking around the search
internally. So it needs to grab the reference count internally as
well. The execlist only request based search relies on external
locking, so it needs an external reference count but within the
spinlock not outside it.

The only other caller of the context based search is the code for
dumping engine state to debugfs. That code wasn't previously getting
an explicit reference at all as it does everything while holding the
execlist specific spinlock. So, that needs updaing as well as that
spinlock doesn't help when using GuC submission. Rather than trying to
conditionally get/put depending on submission model, just change it to
always do the get/put.

v2: Explicitly document adding an extra blank line in some dense code
(Andy Shevchenko). Fix multiple potential null pointer derefs in case
of no request found (some spotted by Tvrtko, but there was more!).
Also fix a leaked request in case of !started and another in
__guc_reset_context now that intel_context_find_active_request is
actually reference counting the returned request.
v3: Add a _get suffix to intel_context_find_active_request now that it
grabs a reference (Daniele).
v4: Split the intel_guc_find_hung_context change to a separate patch
and rename intel_context_find_active_request_get to
intel_context_get_active_request (Tvrtko).
v5: s/locking/reference counting/ in commit message (Tvrtko)

Fixes: dc0dad365c5e ("drm/i915/guc: Fix for error capture after full GPU reset with GuC")
Fixes: 573ba126aef3 ("drm/i915/guc: Capture error state on context reset")
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Andrzej Hajda <andrzej.hajda@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: Michael Cheng <michael.cheng@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Alan Previn <alan.previn.teres.alexis@intel.com>
Cc: Bruce Chang <yu.bruce.chang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230127002842.3169194-3-John.C.Harrison@Intel.com
(cherry picked from commit 3700e353781e27f1bc7222f51f2cc36cbeb9b4ec)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 51daa42d 29-Nov-2022 Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Revert "i915: Move list_count() to list.h for broader use"

This reverts commit a9efc04cfd05690e91279f41c2325c46335c43ef as it
breaks the build.

Link: https://lore.kernel.org/r/20221130131854.35b58b16@canb.auug.org.au
Link: https://lore.kernel.org/r/202211301628.iwMjPVMp-lkp@intel.com
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# a9efc04c 23-Nov-2022 Andy Shevchenko <andriy.shevchenko@linux.intel.com>

i915: Move list_count() to list.h for broader use

Some of the existing users, and definitely will be new ones, want to
count existing nodes in the list. Provide a generic API for that by
moving code from i915 to list.h.

Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Link: https://lore.kernel.org/r/20221123144901.40493-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


# 801543b2 09-Nov-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: stop including i915_irq.h from i915_trace.h

Turns out many of the files that need i915_reg.h get it implicitly via
{display/intel_de.h, gt/intel_context.h} -> i915_trace.h -> i915_irq.h
-> i915_reg.h. Since i915_trace.h doesn't actually need i915_irq.h,
makes sense to drop it, but that requires adding quite a few new
includes all over the place.

Prefer including i915_reg.h where needed instead of adding another
implicit include, because eventually we'll want to split up i915_reg.h
and only include the specific registers at each place.

Also some places actually needed i915_irq.h too.

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6e78a2e0ac1bffaf5af3b5ccc21dff05e6518cef.1668008071.git.jani.nikula@intel.com


# ef8281ab 02-Nov-2022 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/mtl: add GSC CS reset support

The GSC CS has its own dedicated bit in the GDRST register.

Bspec: 52549
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-5-daniele.ceraolospurio@intel.com


# 5fd974d1 02-Nov-2022 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/mtl: add initial definitions for GSC CS

Starting on MTL, the GSC is no longer managed with direct MMIO access,
but we instead have a dedicated command streamer for it. As a first step
for adding support for this CS, add the required definitions.
Note that, although it is now a CS, the GSC retains its old
class:instance value (OTHER_CLASS instance 6)

Bspec: 65308, 45605
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-2-daniele.ceraolospurio@intel.com


# d7a8680e 06-Oct-2022 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Improve long running compute w/a for GuC submission

A workaround was added to the driver to allow compute workloads to run
'forever' by disabling pre-emption on the RCS engine for Gen12.
It is not totally unbound as the heartbeat will kick in eventually
and cause a reset of the hung engine.

However, this does not work well in GuC submission mode. In GuC mode,
the pre-emption timeout is how GuC detects hung contexts and triggers
a per engine reset. Thus, disabling the timeout means also losing all
per engine reset ability. A full GT reset will still occur when the
heartbeat finally expires, but that is a much more destructive and
undesirable mechanism.

The purpose of the workaround is actually to give compute tasks longer
to reach a pre-emption point after a pre-emption request has been
issued. This is necessary because Gen12 does not support mid-thread
pre-emption and compute tasks can have long running threads.

So, rather than disabling the timeout completely, just set it to a
'long' value.

v2: Review feedback from Tvrtko - must hard code the 'long' value
instead of determining it algorithmically. So make it an extra CONFIG
definition. Also, remove the execlist centric comment from the
existing pre-emption timeout CONFIG option given that it applies to
more than just execlists.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Michal Mrozek <michal.mrozek@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-5-John.C.Harrison@Intel.com


# c3bd49cd 06-Oct-2022 John Harrison <John.C.Harrison@Intel.com>

drm/i915: Fix compute pre-emption w/a to apply to compute engines

An earlier patch added support for compute engines. However, it missed
enabling the anti-pre-emption w/a for the new engine class. So move
the 'compute capable' flag earlier and use it for the pre-emption w/a
test.

Fixes: c674c5b9342e ("drm/i915/xehp: CCS should use RCS setup functions")
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Cc: "Michał Winiarski" <michal.winiarski@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-3-John.C.Harrison@Intel.com


# 568944af 06-Oct-2022 John Harrison <John.C.Harrison@Intel.com>

drm/i915/guc: Limit scheduling properties to avoid overflow

GuC converts the pre-emption timeout and timeslice quantum values into
clock ticks internally. That significantly reduces the point of 32bit
overflow. On current platforms, worst case scenario is approximately
110 seconds. Rather than allowing the user to set higher values and
then get confused by early timeouts, add limits when setting these
values.

v2: Add helper functions for clamping (review feedback from Tvrtko).
v3: Add a bunch of BUG_ON range checks in addition to the checks
already in the clamping functions (Tvrtko)

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221006213813.1563435-2-John.C.Harrison@Intel.com


# dfa13f1b 14-Oct-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/gen8: Create separate reg definitions for new MCR registers

Gen8 was the first time our hardware had multicast registers (or at
least the first time the multicast nature was exposed and MMIO accesses
could be steered). There are some registers that transitioned from
singleton behavior to multicast during the gen7 -> gen8 transition;
let's duplicate the register definitions for those registers in
preparation for upcoming patches that will handle MCR registers in a
special manner.

The registers adjusted are:
* MISCCPCTL
* SAMPLER_INSTDONE
* ROW_INSTDONE
* ROW_CHICKEN2
* HALF_SLICE_CHICKEN1
* HALF_SLICE_CHICKEN3

v2:
- Use the gen8 version of HALF_SLICE_CHICKEN3 in GVT's gen9 engine MMIO
list. (Bala)
- Update to the gen8 version of MISCCPCTL in a couple new workarounds
that were recently added for DG2/PVC. (Bala)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20221014230239.1023689-2-matthew.d.roper@intel.com


# 78a03343 15-Sep-2022 Chris Wilson <chris.p.wilson@intel.com>

drm/i915/gt: Cleanup partial engine discovery failures

If we abort driver initialisation in the middle of gt/engine discovery,
some engines will be fully setup and some not. Those incompletely setup
engines only have 'engine->release == NULL' and so will leak any of the
common objects allocated.

v2:
- Drop the destroy_pinned_context() helper for now. It's not really
worth it with just a single callsite at the moment. (Janusz)

Signed-off-by: Chris Wilson <chris.p.wilson@intel.com>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220915232654.3283095-2-matthew.d.roper@intel.com


# 69a3738b 12-Sep-2022 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: Skip applying copy engine fuses

Support for reading the fuses to check what are the Link Copy engines
was added in commit ad5f74f34201 ("drm/i915/pvc: read fuses for link
copy engines"). However they were added unconditionally because the
FUSE3 register is present since graphics version 10.

However the bitfield with meml3 fuses only exists since graphics version
12. Moreover, Link Copy engines are currently only available in PVC.
Tying additional copy engines to the meml3 fuses is not correct for
other platforms.

Make sure there is a check for `12.60 <= ver < 12.70`. Later platforms
may extend this function later if it's needed to fuse off copy engines.

Currently it's harmless as the Link Copy engines are still not exported:
info->engine_mask only has BCS0 set and the register is only read for
platforms that do have it.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220912-copy-engine-v1-1-ef92fd81758d@intel.com


# 0b3ed50e 09-Sep-2022 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: Extract function to apply media fuses

Just like is done for compute and copy engines, extract a function to
handle media engines. While at it, be consistent on using or not the
uncore/gt/info variable aliases.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220909-media-v2-2-6f20f322b4ef@intel.com


# ff21ed39 09-Sep-2022 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: Use MEDIA_VER() when handling media fuses

Check for media IP version instead of graphics since this is figuring
out the media engines' configuration. Currently the only platform with
non-matching graphics/media version is Meteor Lake: update the check in
gen11_vdbox_has_sfc() so it considers not only version 12, but also any
later version which then includes that platform.

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220909-media-v2-1-6f20f322b4ef@intel.com


# 03d2c54d 06-Sep-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/mtl: Use primary GT's irq lock for media GT

When we hook up interrupts (in the next patch), interrupts for the media
GT are still processed as part of the primary GT's interrupt flow. As
such, we should share the same IRQ lock with the primary GT. Let's
convert gt->irq_lock into a pointer and just point the media GT's
instance at the same lock the primary GT is using.

v2:
- Point media's gt->irq_lock at the primary GT lock properly. (Daniele)
- Fix jump target for intel_root_gt_init_early errors. (Daniele)

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-14-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 70fff19a 06-Sep-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Prepare more multi-GT initialization

We're going to introduce an additional intel_gt for MTL's media unit
soon. Let's provide a bit more multi-GT initialization framework in
preparation for that. The initialization will pull the list of GTs for
a platform from the device info structure. Although necessary for the
immediate MTL media enabling, this same framework will also be used
farther down the road when we enable remote tiles on xehpsdv and pvc.

v2:
- Re-add missing test for !HAS_EXTRA_GT_LIST in intel_gt_probe_all().

v3:
- Move intel_gt_definition struct to intel_gt_types.h. (Jani)
- Drop gtdef->setup(). For now we'll just use a switch() based on GT
type since we don't have too many different handlers for the
foreseeable future. (Jani)

Cc: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220906234934.3655440-6-matthew.d.roper@intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 068a0f5c 18-Aug-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/mtl: Don't mask off CCS according to DSS fusing

Unlike the Xe_HP platforms, MTL only has a single CCS engine; the
quad-based engine masking logic does not apply to this platform (or
presumably any future platforms that only have 0 or 1 CCS).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Radhakrishna Sripada <radhakrishna.sripada@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220818234202.451742-5-radhakrishna.sripada@intel.com


# 488e29fe 19-Aug-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: move platform_engine_mask to runtime info

If it's modified runtime, it's runtime info.

mock_gem_device() is the only one that modifies it. If that could be
fixed, we wouldn't have to do this.

Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Maarten Lankhort <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1261406b373998c1a171ee9ed91f7f562826eba6.1660910433.git.jani.nikula@intel.com


# 9a92732f 01-Jul-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr

Although all DSS belong to a single pool on Xe_HP platforms (i.e.,
they're not organized into slices from a topology point of view), we do
still need to pass 'group' and 'instance' targets when steering register
accesses to a specific instance of a per-DSS multicast register. The
rules for how to determine group and instance IDs (which previously used
legacy terms "slice" and "subslice") varies by platform. Some platforms
determine steering by gslice membership, some platforms by cslice
membership, and future platforms may have other rules.

Since looping over each DSS and performing steered unicast register
accesses is a relatively common pattern, let's add a dedicated iteration
macro to handle this (and replace the platform-specific "instdone" loop
we were using previously. This will avoid the calling code needing to
figure out the details about how to obtain steering IDs for a specific
DSS.

Most of the places where we use this new loop are in the GPU errorstate
code at the moment, but we do have some additional features coming in
the future that will also need to loop over each DSS and steer some
register accesses accordingly.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220701232006.1016135-1-matthew.d.roper@intel.com


# 0667429c 21-Jun-2022 Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend

For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.

In addition, extend the WA to gen11.

v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks

v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: f6aa0d713c88 ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com


# 3fe6c7f5 14-Jun-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/gt: Cleanup interface for MCR operations

Let's replace the assortment of intel_gt_* and intel_uncore_* functions
that operate on MCR registers with a cleaner set of interfaces:

* intel_gt_mcr_read -- unicast read from specific instance
* intel_gt_mcr_read_any[_fw] -- unicast read from any non-terminated
instance
* intel_gt_mcr_unicast_write -- unicast write to specific instance
* intel_gt_mcr_multicast_write[_fw] -- multicast write to all instances

We'll also replace the historic "slice" and "subslice" terminology with
"group" and "instance" to match the documentation for more recent
platforms; these days MCR steering applies to more types of replication
than just slice/subslice.

v2:
- Reference the new kerneldoc from i915.rst. (Jani)
- Tweak the wording of the documentation for a couple functions to
clarify the difference between "_fw" and non-"_fw" forms.

v3:
- s/read/write/ to fix copy-paste mistake in a couple comments.
(Harish)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615001019.1821989-3-matthew.d.roper@intel.com


# e7858254 14-Jun-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/gt: Move multicast register handling to a dedicated file

Handling of multicast/replicated registers is spread across intel_gt.c
and intel_uncore.c today. As multicast handling and the related
steering logic gets more complicated with the addition of new platforms
and new rules it makes sense to centralize it all in one place.

For now the existing functions have been moved to the new .c/.h as-is.
Function renames and updates to operate in a more consistent manner will
be done in subsequent patches.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Jani Nikula <jani.nikula@linux.intel.com>
Reviewed-by: Harish Chegondi <harish.chegondi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220615001019.1821989-2-matthew.d.roper@intel.com


# b87d3901 01-Jun-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/sseu: Disassociate internal subslice mask representation from uapi

As with EU masks, it's easier to store subslice/DSS masks internally in
a format that's more natural for the driver to work with, and then only
covert into the u8[] uapi form when the query ioctl is invoked. Since
the hardware design changed significantly with Xe_HP, we'll use a union
to choose between the old "hsw-style" subslice masks or the newer xehp
mask. HSW-style masks will be stored in an array of u8's, indexed by
slice (there's never more than 6 subslices per slice on older
platforms). For Xe_HP and beyond where slices no longer exist, we only
need a single bitmask. However we already know that this mask is
eventually going to grow too large for a simple u64 to hold, so we'll
represent it in a manner that can be operated on by the utilities in
linux/bitmap.h.

v2:
- Fix typo: BIT(s) -> BIT(ss) in gen9_sseu_device_status()

v3:
- Eliminate sseu->ss_stride and just calculate the stride while
specifically handling uapi. (Tvrtko)
- Use BITMAP_BITS() macro to refer to size of masks rather than
passing I915_MAX_SS_FUSE_BITS directly. (Tvrtko)
- Report compute/geometry DSS masks separately when dumping Xe_HP SSEU
info. (Tvrtko)
- Restore dropped range checks to intel_sseu_has_subslice(). (Tvrtko)

v4:
- Make the bitmap size macro check the size of the .xehp field rather
than the containing union. (Tvrtko)
- Don't add GEM_BUG_ON() intel_sseu_has_subslice()'s check for whether
slice or subslice ID exceed sseu->max_[sub]slices; various loops
in the driver are expected to exceed these, so we should just
silently return 'false.'

v5:
- Move XEHP_BITMAP_BITS() to the header so that we can also replace a
usage of I915_MAX_SS_FUSE_BITS in one of the inline functions.
(Bala)
- Change the local variable in intel_slicemask_from_xehp_dssmask() from
u16 to 'unsigned long' to make it a bit more future-proof.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220601150725.521468-6-matthew.d.roper@intel.com


# ad5f74f3 05-May-2022 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/pvc: read fuses for link copy engines

The new Link Copy engines in PVC may be fused off according to the
mslice_mask. Each bit of the MEML3_EN_MASK we read from the
GEN10_MIRROR_FUSE3 register disables a pair of link copy engines.

v2 (Tvrtko):
- Minor cosmetic changes: s/u8/unsigned long/, use instance local
variable. (Tvrtko)

Bspec: 44483
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220505213812.3979301-13-matthew.d.roper@intel.com


# 8caaf7ad 05-May-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/pvc: Reset support for new copy engines

Add the reset support for new copy engines in PVC.

Bspec: 52549
Original-author: CQ Tang
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220505213812.3979301-11-matthew.d.roper@intel.com


# 69f8afdb 05-May-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/pvc: Engine definitions for new copy engines

This patch adds the basic definitions needed to support
new copy engines. Also updating the cmd_info to accommodate
new engines, as the engine id's of legacy engines have been
changed.

v2:
- Add _BCS(n) definition, similar to other engines. (Tvrtko)
- Add I915_MAX_BCS definition, similar to other engnes. (Prathap)
- Move GVT change to avoid u16 overflow to its own patch. (Tvrtko)

Original-author: CQ Tang
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220505213812.3979301-9-matthew.d.roper@intel.com


# a7a47a5d 21-Jun-2022 Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

drm/i915/reset: Add additional steps for Wa_22011802037 for execlist backend

For execlists backend, current implementation of Wa_22011802037 is to
stop the CS before doing a reset of the engine. This WA was further
extended to wait for any pending MI FORCE WAKEUPs before issuing a
reset. Add the extended steps in the execlist path of reset.

In addition, extend the WA to gen11.

v2: (Tvrtko)
- Clarify comments, commit message, fix typos
- Use IS_GRAPHICS_VER for gen 11/12 checks

v3: (Daneile)
- Drop changes to intel_ring_submission since WA does not apply to it
- Log an error if MSG IDLE is not defined for an engine

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Fixes: f6aa0d713c88 ("drm/i915: Add Wa_22011802037 force cs halt")
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220621192105.2100585-1-umesh.nerlige.ramappa@intel.com
(cherry picked from commit 0667429ce68e0b08f9f1fec8fd0b1f57228f605e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>


# f6aa0d71 15-Apr-2022 Tilak Tangudu <tilak.tangudu@intel.com>

drm/i915: Add Wa_22011802037 force cs halt

Prior to doing a reset, SW must ensure command streamer is stopped,
as a workaround, to eliminate a race condition in GPM flow.
Setting both the ring stop and prefetch disable bits, will cause the
command streamer to halt.

Signed-off-by: Tilak Tangudu <tilak.tangudu@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220415224025.3693037-2-umesh.nerlige.ramappa@intel.com


# a0f1f7b4 21-Mar-2022 Alan Previn <alan.previn.teres.alexis@intel.com>

drm/i915/guc: Print the GuC error capture output register list.

Print the GuC captured error state register list (string names
and values) when gpu_coredump_state printout is invoked via
the i915 debugfs for flushing the gpu error-state that was
captured prior.

Since GuC could have reported multiple engine register dumps
in a single notification event, parse the captured data
(appearing as a stream of structures) to identify each dump as
a different 'engine-capture-group-output'.

Finally, for each 'engine-capture-group-output' that is found,
verify if the engine register dump corresponds to the
engine_coredump content that was previously populated by the
i915_gpu_coredump function. That function would have copied
the context's vma's including the bacth buffer during the
G2H-context-reset notification that occurred earlier. Perform
this verification check by comparing guc_id, lrca and engine-
instance obtained from the 'engine-capture-group-output' vs a
copy of that same info taken during i915_gpu_coredump. If
they match, then print those vma's as well (such as the batch
buffers).

NOTE: the output format was verified using the gem_exec_capture
IGT test.

Signed-off-by: Alan Previn <alan.previn.teres.alexis@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220321164527.2500062-14-alan.previn.teres.alexis@intel.com


# 6f270e14 16-Mar-2022 Matthew Brost <matthew.brost@intel.com>

drm/i915: Add logical mapping for video decode engines

Add logical mapping for VDBOXs. This mapping is required for
split-frame workloads, which otherwise fail with

00000000-F8C53528: [GUC] 0441-INVALID_ENGINE_SUBMIT_MASK

... if the application is using the logical id to reorder the engines and
then using it for the batch buffer submission. It's not a big problem on
media version 11 and 12 as they have only 2 instances of VCS and the
logical to physical mapping is monotonically increasing - if the
application is not using the logical id.

Changing it for the previous platforms allows the media driver
implementation for the next ones (12.50 and above) to be the same,
checking the logical id. It should also not introduce any bug for the
old versions of userspace not checking the id.

The mapping added here is the complete map needed by XEHPSDV. Previous
platforms with only 2 instances will just use a partial map and should
still work.

v2: Remove static from map variable (José)

Cc: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
[ Extend the mapping to media versions 11 and 12 and give proper
justification in the commit message why ]
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220316234538.434357-2-lucas.demarchi@intel.com


# f9576e36 03-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: Support platforms with CCS engines but no RCS

In the past we've always assumed that an RCS engine is present on every
platform. However now that we have compute engines there may be
platforms that have CCS engines but no RCS, or platforms that are
designed to have both, but have the RCS engine fused off.

Various engine-centric initialization that only needs to be done a
single time for the group of RCS+CCS engines can't rely on being setup
with the RCS now; instead we add a I915_ENGINE_FIRST_RENDER_COMPUTE flag
that will be assigned to a single engine in the group; whichever engine
has this flag will be responsible for some of the general setup
(RCU_MODE programming, initialization of certain workarounds, etc.).

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220303223435.2793124-1-matthew.d.roper@intel.com


# ff9fbe7c 25-Feb-2022 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: Use str_enabled_disabled()

Remove the local enableddisabled() implementation and adopt the
str_enabled_disabled() from linux/string_helpers.h.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225234631.3725943-3-lucas.demarchi@intel.com


# 01fabda8 25-Feb-2022 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915: Use str_yes_no()

Remove the local yesno() implementation and adopt the str_yes_no() from
linux/string_helpers.h.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225234631.3725943-1-lucas.demarchi@intel.com


# 88ed07cb 01-Mar-2022 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/xehp: handle fused off CCS engines

HW resources are divided across the active CCS engines at the compute
slice level, with each CCS having priority on one of the cslices.
If a compute slice has no enabled DSS, its paired compute engine is not
usable in full parallel execution because the other ones already fully
saturate the HW, so consider it fused off.

v2 (José):
- moved it to its own function
- fixed definition of ccs_mask

v3 (Matt):
- Replace fls() condition with a simple IP version test

v4 (Matt):
- Don't try to calculate a ccs_mask using
intel_slicemask_from_dssmask() until we've determined that we're
running on an Xe_HP platform where the logic makes sense (and won't
overflow).

Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302052008.1884985-1-matthew.d.roper@intel.com


# 87cb6d80 01-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: Enable ccs/dual-ctx in RCU_MODE

We have to specify in the Render Control Unit Mode register
when CCS is enabled.

v2:
- Move RCU_MODE programming to a helper function. (Tvrtko)
- Clean up and clarify comments. (Tvrtko)
- Add RCU_MODE to the GuC save/restore list. (Daniele)
v3:
- Move this patch before the GuC ADS update to enable compute engines;
the definition of RCU_MODE and its insertion into the save/restore
list moves to this patch. (Daniele)
v4:
- Call xehp_enable_ccs_engines() directly in guc_resume() and
execlists_resume() rather than adding an extra layer of wrapping to
the engine->resume() vfunc. (Umesh)

Bspec: 46034
Original-author: Michel Thierry
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Cc: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220302001554.1836066-1-matthew.d.roper@intel.com


# adfadb56 01-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: Define context scheduling attributes in lrc descriptor

In Dual Context mode the EUs are shared between render and compute
command streamers. The hardware provides a field in the lrc descriptor
to indicate the prioritization of the thread dispatch associated to the
corresponding context.

The context priority is set to 'low' at creation time and relies on the
existing context priority to set it to low/normal/high.

Bspec: 46145, 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Prasad Nallani <prasad.nallani@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-8-matthew.d.roper@intel.com


# f4c1fdb9 01-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Move context descriptor fields to intel_lrc.h

This is a more appropriate header for these definitions.

v2:
- Cleanup whitespace. (Lucas)

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-7-matthew.d.roper@intel.com


# c674c5b9 01-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: CCS should use RCS setup functions

The compute engine handles the same commands the render engine can
(except 3D pipeline), so it makes sense that CCS is more similar to RCS
than non-render engines.

The CCS context state (lrc) is also similar to the render one, so reuse
it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE
register.

In order to avoid having multiple RCS && CCS checks, add the following
engine flag:
- I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx.

BSpec: 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com


# 4b88ad50 01-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: CCS shares the render reset domain

The reset domain is shared between render and all compute engines,
so resetting one will affect the others.

Note: Before performing a reset on an RCS or CCS engine, the GuC will
attempt to preempt-to-idle the other non-hung RCS/CCS engines to avoid
impacting other clients (since some shared modules will be reset). If
other engines are executing non-preemptable workloads, the impact is
unavoidable and some work may be lost.

Bspec: 52549
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-3-matthew.d.roper@intel.com


# 944823c9 01-Mar-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: Define compute class and engine

Introduce a Compute Command Streamer (CCS), which has access to
the media and GPGPU pipelines (but not the 3D pipeline).

To begin with, define the compute class/engine common functions, based
on the existing render ones.

v2:
- Add kerneldoc for drm_i915_gem_engine_class since we're adding a new
element to it. (Daniel)
- Make engine class <-> guc class converters use lookup tables to make
it more clear/explicit how the IDs map. (Tvrtko)

v3:
- Don't update uapi for now; we'll just include the driver-internal
changes for the time being.

Bspec: 46167, 45544
Original-author: Michel Thierry
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-2-matthew.d.roper@intel.com


# 64b2a6a0 17-Feb-2022 Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>

drm/i915/gt: use get_reset_domain() helper

We dont need to implement reset_domain in intel_engine
_setup(), but can be done as a helper. Implemented as
engine->reset_domain = get_reset_domain().

Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220217123223.748184-1-tejaskumarx.surendrakumar.upadhyay@intel.com


# 74fc5954 10-Feb-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: move i915_cache_level_str() static in i915_debugfs.c

Move the function next to the only user.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/dc0901dbe424c21b3e03b875bf5b944b214d1af4.1644507885.git.jani.nikula@intel.com


# b508d01f 10-Feb-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: split out i915_gem_internal.h from i915_drv.h

We already have the i915_gem_internal.c file.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/6715d1f3232c445990630bb3aac00f279f516fee.1644507885.git.jani.nikula@intel.com


# 0d6419e9 27-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915: Move GT registers to their own header file

This is a huge, chaotic mass of registers copied over as-is without any
real cleanup. We'll come back and organize these better, align on
consistent coding style, remove dead code, etc. in separate patches
later that will be easier to review.

v2:
- Add missing include in intel_pxp_irq.c
v3:
- Correct a few indentation errors (Lucas)
- Minor conflict resolution

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com


# 202b1f4c 10-Jan-2022 Matt Roper <matthew.d.roper@intel.com>

drm/i915/gt: Move engine registers to their own header

Let's continue breaking up and cleaning up the massive i915_reg.h file
by moving all registers that are defined in relation to an engine base
to their own header.

There are probably a bunch of other "engine registers" that we haven't
moved yet (especially those that belong to the render engine in the
0x2??? range), but this is a relatively straightforward first step.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com


# 60dc43d1 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Use struct vma_resource instead of struct vma_snapshot

There is always a struct vma_resource guaranteed to be alive when we
access a corresponding struct vma_snapshot.

So ditch the latter and instead of allocating vma_snapshots, reference
the already existning vma_resource.

This requires a couple of extra members in struct vma_resource but that's
a small price to pay for the simplification.

v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-7-thomas.hellstrom@linux.intel.com


# 39a2bd34 10-Jan-2022 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Use the vma resource as argument for gtt binding / unbinding

When introducing asynchronous unbinding, the vma itself may no longer
be alive when the actual binding or unbinding takes place.

Update the gtt i915_vma_ops accordingly to take a struct i915_vma_resource
instead of a struct i915_vma for the bind_vma() and unbind_vma() ops.
Similarly change the insert_entries() op for struct i915_address_space.

Replace a couple of i915_vma_snapshot members with their newly introduced
i915_vma_resource counterparts, since they have the same lifetime.

Also make sure to avoid changing the struct i915_vma_flags (in particular
the bind flags) async. That should now only be done sync under the
vm mutex.

v2:
- Update the vma_res::bound_flags when binding to the aliased ggtt
v6:
- Remove I915_VMA_ALLOC_BIT (Matthew Auld)
- Change some members of struct i915_vma_resource from unsigned long to u64
(Matthew Auld)
v7:
- Fix vma resource size parameters to be u64 rather than unsigned long
(Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-3-thomas.hellstrom@linux.intel.com


# 23d639d7 07-Jan-2022 Jani Nikula <jani.nikula@intel.com>

drm/i915: split out i915_cmd_parser.h from i915_drv.h

We already have the i915_cmd_parser.c file.

Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1a02b8788266f4f2fd4de12808b55c4a66179e98.1641561552.git.jani.nikula@intel.com


# 20cddfcc 06-Dec-2021 Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>

drm/i915/gt: Use hw_engine_masks as reset_domains

We need a way to reset engines by their reset domains.
This change sets up way to fetch reset domains of each
engine globally.

Changes since V1:
- Use static reset domain array - Ville and Tvrtko
- Use BUG_ON at appropriate place - Tvrtko

Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211206081026.4024401-1-tejaskumarx.surendrakumar.upadhyay@intel.com


# ff20afc4 29-Nov-2021 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915: Update error capture code to avoid using the current vma state

With asynchronous migrations, the vma state may be several migrations
ahead of the state that matches the request we're capturing.
Address that by introducing an i915_vma_snapshot structure that
can be used to snapshot relevant state at request submission.
In order to make sure we access the correct memory, the snapshots take
references on relevant sg-tables and memory regions.

Also move the capture list allocation out of the fence signaling
critical path and use the CONFIG_DRM_I915_CAPTURE_ERROR define to
avoid compiling in members and functions used for error capture
when they're not used.

Finally, Introduce lockdep annotation.

v4:
- Break out the capture allocation mode change to a separate patch.
v5:
- Fix compilation error in the !CONFIG_DRM_I915_CAPTURE_ERROR case
(kernel test robot)
v6:
- Use #if IS_ENABLED() instead of #ifdef to match driver style.
- Move yet another change of allocation mode to the separate patch.
- Commit message rework due to patch reordering.
v7:
- Adjust for removal of region refcounting.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211129202245.472043-1-thomas.hellstrom@linux.intel.com


# 77cdd054 26-Oct-2021 Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

drm/i915/pmu: Connect engine busyness stats from GuC to pmu

With GuC handling scheduling, i915 is not aware of the time that a
context is scheduled in and out of the engine. Since i915 pmu relies on
this info to provide engine busyness to the user, GuC shares this info
with i915 for all engines using shared memory. For each engine, this
info contains:

- total busyness: total time that the context was running (total)
- id: id of the running context (id)
- start timestamp: timestamp when the context started running (start)

At the time (now) of sampling the engine busyness, if the id is valid
(!= ~0), and start is non-zero, then the context is considered to be
active and the engine busyness is calculated using the below equation

engine busyness = total + (now - start)

All times are obtained from the gt clock base. For inactive contexts,
engine busyness is just equal to the total.

The start and total values provided by GuC are 32 bits and wrap around
in a few minutes. Since perf pmu provides busyness as 64 bit
monotonically increasing values, there is a need for this implementation
to account for overflows and extend the time to 64 bits before returning
busyness to the user. In order to do that, a worker runs periodically at
frequency = 1/8th the time it takes for the timestamp to wrap. As an
example, that would be once in 27 seconds for a gt clock frequency of
19.2 MHz.

Note:
There might be an over-accounting of busyness due to the fact that GuC
may be updating the total and start values while kmd is reading them.
(i.e kmd may read the updated total and the stale start). In such a
case, user may see higher busyness value followed by smaller ones which
would eventually catch up to the higher value.

v2: (Tvrtko)
- Include details in commit message
- Move intel engine busyness function into execlist code
- Use union inside engine->stats
- Use natural type for ping delay jiffies
- Drop active_work condition checks
- Use for_each_engine if iterating all engines
- Drop seq locking, use spinlock at GuC level to update engine stats
- Document worker specific details

v3: (Tvrtko/Umesh)
- Demarcate GuC and execlist stat objects with comments
- Document known over-accounting issue in commit
- Provide a consistent view of GuC state
- Add hooks to gt park/unpark for GuC busyness
- Stop/start worker in gt park/unpark path
- Drop inline
- Move spinlock and worker inits to GuC initialization
- Drop helpers that are called only once

v4: (Tvrtko/Matt/Umesh)
- Drop addressed opens from commit message
- Get runtime pm in ping, remove from the park path
- Use cancel_delayed_work_sync in disable_submission path
- Update stats during reset prepare
- Skip ping if reset in progress
- Explicitly name execlists and GuC stats objects
- Since disable_submission is called from many places, move resetting
stats to intel_guc_submission_reset_prepare

v5: (Tvrtko)
- Add a trylock helper that does not sleep and synchronize PMU event
callbacks and worker with gt reset

v6: (CI BAT failures)
- DUTs using execlist submission failed to boot since __gt_unpark is
called during i915 load. This ends up calling the GuC busyness unpark
hook and results in kick-starting an uninitialized worker. Let
park/unpark hooks check if GuC submission has been initialized.
- drop cant_sleep() from trylock helper since rcu_read_lock takes care
of that.

v7: (CI) Fix igt@i915_selftest@live@gt_engines
- For GuC mode of submission the engine busyness is derived from gt time
domain. Use gt time elapsed as reference in the selftest.
- Increase busyness calculation to 10ms duration to ensure batch runs
longer and falls within the busyness tolerances in selftest.

v8:
- Use ktime_get in selftest as before
- intel_reset_trylock_no_wait results in a lockdep splat that is not
trivial to fix since the PMU callback runs in irq context and the
reset paths are tightly knit into the driver. The test that uncovers
this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait,
instead use the reset_count to synchronize with gt reset during pmu
callback. For the ping, continue to use intel_reset_trylock since ping
is not run in irq context.

- GuC PM timestamp does not tick when GuC is idle. This can potentially
result in wrong busyness values when a context is active on the
engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to
process the GuC busyness stats. This works since both GuC timestamp and
RING timestamp are synced with the same clock.

- The busyness stats may get updated after the batch starts running.
This delay causes the busyness reported for 100us duration to fall
below 95% in the selftest. The only option at this time is to wait for
GuC busyness to change from idle to active before we sample busyness
over a 100us period.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com


# 344e6947 26-Oct-2021 Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>

drm/i915/pmu: Add a name to the execlists stats

In preparation for GuC pmu stats, add a name to the execlists stats
structure so that it can be differentiated from the GuC stats.

Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-1-umesh.nerlige.ramappa@intel.com


# e5e32171 14-Oct-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Connect UAPI to GuC multi-lrc interface

Introduce 'set parallel submit' extension to connect UAPI to GuC
multi-lrc interface. Kernel doc in new uAPI should explain it all.

IGT: https://patchwork.freedesktop.org/patch/447008/?series=93071&rev=1
media UMD: https://github.com/intel/media-driver/pull/1252

v2:
(Daniel Vetter)
- Add IGT link and placeholder for media UMD link
v3:
(Kernel test robot)
- Fix warning in unpin engines call
(John Harrison)
- Reword a bunch of the kernel doc
v4:
(John Harrison)
- Add comment why perma-pin is done after setting gem context
- Update some comments / docs for proto contexts
v5:
(John Harrison)
- Rework perma-pin comment
- Add BUG_IN if context is pinned when setting gem context

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-17-matthew.brost@intel.com


# 4f3059dc 14-Oct-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Add logical engine mapping

Add logical engine mapping. This is required for split-frame, as
workloads need to be placed on engines in a logically contiguous manner.

v2:
(Daniel Vetter)
- Add kernel doc for new fields
v3:
(Tvrtko)
- Update comment for new logical_mask field
v4:
(John Harrison)
- Update comment for new logical_mask field

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-6-matthew.brost@intel.com


# 3e42cc61 22-Sep-2021 Thomas Hellström <thomas.hellstrom@linux.intel.com>

drm/i915/gt: Register the migrate contexts with their engines

Pinned contexts, like the migrate contexts need reset after resume
since their context image may have been lost. Also the GuC needs to
register pinned contexts.

Add a list to struct intel_engine_cs where we add all pinned contexts on
creation, and traverse that list at resume time to reset the pinned
contexts.

This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now,
but proper LMEM backup / restore is needed for full suspend functionality.
However, note that even with full LMEM backup / restore it may be
desirable to keep the reset since backing up the migrate context images
must happen using memcpy() after the migrate context has become inactive,
and for performance- and other reasons we want to avoid memcpy() from
LMEM.

Also traverse the list at guc_init_lrc_mapping() calling
guc_kernel_context_pin() for the pinned contexts, like is already done
for the kernel context.

v2:
- Don't reset the contexts on each __engine_unpark() but rather at
resume time (Chris Wilson).
v3:
- Reset contexts in the engine sanitize callback. (Chris Wilson)

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Brost Matthew <matthew.brost@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com


# ff04f8be 17-Sep-2021 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: Check new fuse bits for SFC availability

Xe_HP adds some new bits to the FUSE1 register to let us know whether a
given SFC unit is present. We should take this into account while
initializing SFC availability to our VCS and VECS engines.

While we're at it, update the FUSE1 register definition to use
REG_GENMASK / REG_FIELD_GET notation.

Note that, the bspec confusingly names the fuse bits "disable" despite
the register reflecting the *enable* status of the SFC units. The
original architecture documents which the bspec is based on do properly
name this field "SFC_ENABLE."

Bspec: 52543
Cc: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210917161203.812251-2-matthew.d.roper@intel.com


# 89f2e7ab 05-Aug-2021 Matt Roper <matthew.d.roper@intel.com>

drm/i915/dg2: Report INSTDONE_GEOM values in error state

Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team
has indicated that having these reported in the error state would be
useful for debugging GPU hangs. These registers are replicated per-DSS
with gslice steering.

Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805163647.801064-4-matthew.d.roper@intel.com


# fa9899da 05-Aug-2021 Matt Roper <matthew.d.roper@intel.com>

drm/i915/xehp: Loop over all gslices for INSTDONE processing

We no longer have traditional slices on Xe_HP platforms, but the
INSTDONE registers are replicated according to gslice representation
which is similar. We can mostly re-use the existing instdone code with
just a few modifications:

* Create an alternate instdone loop macro that will iterate over the
flat DSS space, but still provide the gslice/dss steering values for
compatibility with the legacy code.

* We should allocate INSTDONE storage space according to the maximum
number of gslices rather than the maximum number of legacy slices to
ensure we have enough storage space to hold all of the values. XeHP
design has 8 gslices, whereas older platforms never had more than 3
slices.

Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210805163647.801064-3-matthew.d.roper@intel.com


# 6266992c 28-Jul-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: remove GRAPHICS_VER == 10

Replace all remaining handling of GRAPHICS_VER {==,>=} 10 with
{==,>=} 11. With the removal of CNL, there is no platform with graphics
version equals 10.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728220326.1578242-5-lucas.demarchi@intel.com


# dc0dad36 26-Jul-2021 John Harrison <John.C.Harrison@Intel.com>

drm/i915/guc: Fix for error capture after full GPU reset with GuC

In the case of a full GPU reset (e.g. because GuC has died or because
GuC's hang detection has been disabled), the driver can't rely on GuC
reporting the guilty context. Instead, the driver needs to scan all
active contexts and find one that is currently executing, as per the
execlist mode behaviour. In GuC mode, this scan is different to
execlist mode as the active request list is handled very differently.

Similarly, the request state dump in debugfs needs to be handled
differently when in GuC submission mode.

Also refactured some of the request scanning code to avoid duplication
across the multiple code paths that are now replicating it.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-20-matthew.brost@intel.com


# 573ba126 26-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Capture error state on context reset

We receive notification of an engine reset from GuC at its
completion. Meaning GuC has potentially cleared any HW state
we may have been interested in capturing. GuC resumes scheduling
on the engine post-reset, as the resets are meant to be transparent,
further muddling our error state.

There is ongoing work to define an API for a GuC debug state dump. The
suggestion for now is to manually disable FW initiated resets in cases
where debug state is needed.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-19-matthew.brost@intel.com


# a95d1160 26-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbs

With GuC virtual engines the physical engine which a request executes
and completes on isn't known to the i915. Therefore we can't attach a
request to a physical engines breadcrumbs. To work around this we create
a single breadcrumbs per engine class when using GuC submission and
direct all physical engine interrupts to this breadcrumbs.

v2:
(John H)
- Rework header file structure so intel_engine_mask_t can be in
intel_engine_types.h

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
CC: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-6-matthew.brost@intel.com


# 55612025 26-Jul-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915/guc: GuC virtual engines

Implement GuC virtual engines. Rather simple implementation, basically
just allocate an engine, setup context enter / exit function to virtual
engine specific functions, set all other variables / functions to guc
versions, and set the engine mask to that of all the siblings.

v2: Update to work with proto-ctx
v3:
(Daniele)
- Drop include, add comment to intel_virtual_engine_has_heartbeat

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-2-matthew.brost@intel.com


# 816753c0 22-Jul-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: nuke gen6_hw_id

This is only used by GRAPHICS_VER == 6 and GRAPHICS_VER == 7. All other
recent platforms do not depend on this field, so it doesn't make much
sense to keep it generic like that. Instead, just do a mapping from
engine class to HW ID in the single place that is needed.

v2: use macros with the direct register address instead of calculating
from the legacy HW_ID (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723002551.3906535-1-lucas.demarchi@intel.com


# 938c778f 23-Jul-2021 John Harrison <John.C.Harrison@Intel.com>

drm/i915/xehp: Extra media engines - Part 1 (engine definitions)

Xe_HP can have a lot of extra media engines. This patch adds the basic
definitions for them.

v2:
- Re-order intel_gt_info and intel_device_info slightly to avoid
unnecessary padding now that we've increased the size of
intel_engine_mask_t. (Tvrtko)
v3:
- Drop the .hw_id assignments. (Lucas)
v4:
- Fix graphics_ver typo for VCS4 (should be 12, not 11). (Lucas)

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723191024.1553405-1-matthew.d.roper@intel.com


# eea97e42 21-Jul-2021 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/xehp: VDBOX/VEBOX fusing registers are enable-based

On Xe_HP the fusing register is renamed and changed to have the "enable"
semantics, but otherwise remains compatible (mmio address, bitmask
ranges) with older platforms.

To simplify things we do not add a new register definition but just stop
inverting the fusing masks before processing them.

Bspec: 52615
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721223043.834562-6-matthew.d.roper@intel.com


# 265b5ee0 20-Jul-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: rename legacy engine->hw_id to engine->gen6_hw_id

We kept adding new engines and for that increasing hw_id unnecessarily:
it's not used since GRAPHICS_VER == 8. Prepend "gen6" to the field and
try to pack it in the structs to give a hint this field is actually not
used in recent platforms.

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210720232014.3302645-4-lucas.demarchi@intel.com


# f9be3000 20-Jul-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: nuke unused legacy engine hw_id

The engine hw_id is only used by RING_FAULT_REG(), which is not used
by GRAPHICS_VER >= 8. We did use hw_id on recent platforms to set
the engine's guc_id, but that is not the case anymore since
commit c784e5249e77 ("drm/i915/guc: Update to use firmware v49.0.1"):
now we only use class and id information to generate guc_id.

We tend to keep adding new defines just to be consistent, but let's try
to remove them and let them defined to 0 for engines that only exist on
gen8+ platforms.

v2: Reword commit message and add information about when we stopped
using hw_id (Matt Roper)

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210720232014.3302645-3-lucas.demarchi@intel.com


# 442e049a 21-Jul-2021 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

drm/i915/gen12: Use fuse info to enable SFC

In Gen12 there are various fuse combinations and in each configuration
vdbox engine may be connected to SFC depending on which engines are
available, so we need to set the SFC capability based on fuse value from
the hardware. Even numbered physical instance always have SFC, odd
numbered physical instances have SFC only if previous even instance is
fused off.

v2:
- Minor style & typo fixes (Tvrtko)
- Drop an unwanted 'inline' (Tvrtko)

Bspec: 48028
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721223043.834562-7-matthew.d.roper@intel.com


# 74e4b909 08-Jul-2021 Jason Ekstrand <jason@jlekstrand.net>

drm/i915: Stop storing the ring size in the ring pointer (v3)

Previously, we were storing the ring size in the ring pointer before it
was actually allocated. We would then guard setting the ring size on
checking for CONTEXT_ALLOC_BIT. This is error-prone at best and really
only saves us a few bytes on something that already burns at least 4K.
Instead, this patch adds a new ring_size field and makes everything use
that.

v2 (Daniel Vetter):
- Replace 512 * SZ_4K with SZ_2M

v2 (Jason Ekstrand):
- Rebase on top of page migration code

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-3-jason@jlekstrand.net


# 27e4b467 01-Jul-2021 Thomas Zimmermann <tzimmermann@suse.de>

drm/i915: Use the correct IRQ during resume

The code in xcs_resume() probably didn't work as intended. It uses
struct drm_device.irq, which is allocated to 0, but never initialized
by i915 to the device's interrupt number.

Change all calls to synchronize_hardirq() to intel_synchronize_irq(),
which uses the correct interrupt. _hardirq() functions are not needed
in this context.

v5:
* go back to _hardirq() after PCI probe reported wrong
context; add rsp comment
v4:
* switch everything to intel_synchronize_irq() (Daniel)
v3:
* also use intel_synchronize_hardirq() at another callsite
v2:
* wrap irq code in intel_synchronize_hardirq() (Ville)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210701173618.10718-2-tzimmermann@suse.de


# 22916bad 17-Jun-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Move submission tasklet to i915_sched_engine

The submission tasklet operates on i915_sched_engine, thus it is the
correct place for it.

v3:
(Jason Ekstrand)
Change sched_engine->engine to a void* private data pointer
Add kernel doc
v4:
(Daniele)
Update private_data comment
Set queue_priority_hint in kick_execlists
v5:
(CI)
Rebase and fix build error

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-9-matthew.brost@intel.com


# 3f623e06 17-Jun-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Move engine->schedule to i915_sched_engine

The schedule function should be in the schedule object.

v3:
(Jason Ekstrand)
Add kernel doc

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com


# 349a2bc5 17-Jun-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Move active tracking to i915_sched_engine

Move active request tracking and its lock to i915_sched_engine. This
lock is also the submission lock so having it in the i915_sched_engine
is the correct place.

v3:
(Jason Ekstrand)
Add kernel doc
v6:
Rebase

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.comk>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com


# 074bb195 17-Jun-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Add i915_sched_engine_is_empty function

Add wrapper function around RB tree to determine if i915_sched_engine is
empty.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-3-matthew.brost@intel.com


# 3e28d371 17-Jun-2021 Matthew Brost <matthew.brost@intel.com>

drm/i915: Move priolist to new i915_sched_engine object

Introduce i915_sched_engine object which is lower level data structure
that i915_scheduler / generic code can operate on without touching
execlist specific structures. This allows additional submission backends
to be added without breaking the layering. Currently the execlists
backend uses 1 of these object per each engine (physical or virtual) but
future backends like the GuC will point to less instances utilizing the
reference counting.

This is a bit of detour to integrating the i915 with the DRM scheduler
but this object will still exist when the DRM scheduler lands in the
i915. It will however look a bit different. It will encapsulate the
drm_gpu_scheduler object plus and common variables (to the backends)
related to scheduling. Regardless this is a step in the right direction.

This patch starts the aforementioned transition by moving the priolist
into the i915_sched_engine object.

v3:
(Jason Ekstrand)
Update comment next to intel_engine_cs.virtual
Add kernel doc
(Checkpatch)
Fix double the in commit message
v4:
(Daniele)
Update comment message.
Add comment about subclass field

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-2-matthew.brost@intel.com


# 932641f0 17-Jun-2021 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: extract steered reg access to common function

New steering cases will be added in the follow-up patches, so prepare a
common helper to avoid code duplication.

Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617211425.1943662-2-matthew.d.roper@intel.com


# b4ef9530 17-Jun-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Export the pinned context constructor and destructor

Allow internal clients to create and destroy a pinned context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-7-thomas.hellstrom@linux.intel.com


# 320ad343 01-Jul-2021 Thomas Zimmermann <tzimmermann@suse.de>

drm/i915: Use the correct IRQ during resume

The code in xcs_resume() probably didn't work as intended. It uses
struct drm_device.irq, which is allocated to 0, but never initialized
by i915 to the device's interrupt number.

Change all calls to synchronize_hardirq() to intel_synchronize_irq(),
which uses the correct interrupt. _hardirq() functions are not needed
in this context.

v5:
* go back to _hardirq() after PCI probe reported wrong
context; add rsp comment
v4:
* switch everything to intel_synchronize_irq() (Daniel)
v3:
* also use intel_synchronize_hardirq() at another callsite
v2:
* wrap irq code in intel_synchronize_hardirq() (Ville)

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 536f77b1caa0 ("drm/i915/gt: Call stop_ring() from ring resume, again")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210701173618.10718-2-tzimmermann@suse.de
(cherry picked from commit 27e4b467d94e216b365da388358c9407af818662)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# c816723b 05-Jun-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER

This was done by the following semantic patch:

@@ expression i915; @@
- INTEL_GEN(i915)
+ GRAPHICS_VER(i915)

@@ expression i915; expression E; @@
- INTEL_GEN(i915) >= E
+ GRAPHICS_VER(i915) >= E

@@ expression dev_priv; expression E; @@
- !IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) != E

@@ expression dev_priv; expression E; @@
- IS_GEN(dev_priv, E)
+ GRAPHICS_VER(dev_priv) == E

@@
expression dev_priv;
expression from, until;
@@
- IS_GEN_RANGE(dev_priv, from, until)
+ IS_GRAPHICS_VER(dev_priv, from, until)

@def@
expression E;
identifier id =~ "^gen$";
@@
- id = GRAPHICS_VER(E)
+ ver = GRAPHICS_VER(E)

@@
identifier def.id;
@@
- id
+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com


# 84bdf457 02-Jun-2021 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/guc: Use guc_class instead of engine_class in fw interface

GuC has its own defines for the engine classes. They're currently
mapping 1:1 to the defines used by the driver, but there is no guarantee
this will continue in the future. Given that we've been caught off-guard
in the past by similar divergences, we can prepare for the changes by
introducing helper functions to convert from engine class to GuC class and
back again.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210603051630.2635-21-matthew.brost@intel.com


# 0669a6e1 21-May-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move CS interrupt handler to the backend

The different submission backends each have their own preferred
behaviour and interrupt setup. Let each handle their own interrupts.

This becomes more useful later as we to extract the use of auxiliary
state in the interrupt handler that is backend specific.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-4-matthew.brost@intel.com


# c92c36ed 21-May-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move submission_method into intel_gt

Since we setup the submission method for the engines once, it is easy to
assign an enum and use that instead of probing into the backends.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-3-matthew.brost@intel.com


# e49a8b2c 07-May-2021 Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

drm/i915/gt: Do release kernel context if breadcrumb measure fails

Commit fb5970da1b42 ("drm/i915/gt: Use the kernel_context to measure the
breadcrumb size") reordered some operations inside engine_init_common()
and added an error unwind path to that function. In that path, a
reference to a kernel context candidate supposed to be released on error
was put, but the context, pinned when created, was not unpinned first.
Fix it by replacing intel_context_put() with destroy_pinned_context()
introduced later by commit b436a5f8b6c8 ("drm/i915/gt: Track all timelines
created using the HWSP").

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210507144251.376538-1-janusz.krzysztofik@linux.intel.com


# 1b9d8406 12-Apr-2021 Lucas De Marchi <lucas.demarchi@intel.com>

drm/i915/gt: replace gen use in intel_engine_cs

Start using the new fields graphics_version for the previous gen checks.
Here we rename the "gen" field and replace the comparisons using it to
start using the new GRAPHICS_VER(). Other uses of INTEL_GEN() were left
as is for automatic conversion later.

Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210413051002.92589-6-lucas.demarchi@intel.com


# 2da21daa 05-Feb-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Always flush the submission queue on checking for idle

We check for idle during debug prints and other debugging actions.
Simplify the flow by not touching execlists state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210205174358.28465-1-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 2913fa4d 26-Jan-2021 Emil Renner Berthing <kernel@esmil.dk>

drm/i915/gt: use new tasklet API for execution list

This converts the driver to use the new tasklet API introduced in
commit 12cc923f1ccc ("tasklet: Introduce new initialization API")

v2: Fix up selftests/execlists.

Signed-off-by: Emil Renner Berthing <kernel@esmil.dk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210126150155.1617-1-kernel@esmil.dk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# 24f90d66 22-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: SPDX cleanup

Clean up the SPDX licence declarations to comply with checkpatch.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-1-chris@chris-wilson.co.uk
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>


# fe83ce1e 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Handle ww locking in init_status_page

Try to pin to ggtt first, and use a full ww loop to handle
eviction correctly.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-21-maarten.lankhorst@linux.intel.com


# 12ca695d 23-Mar-2021 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Do not share hwsp across contexts any more, v8.

Instead of sharing pages with breadcrumbs, give each timeline a
single page. This allows unrelated timelines not to share locks
any more during command submission.

As an additional benefit, seqno wraparound no longer requires
i915_vma_pin, which means we no longer need to worry about a
potential -EDEADLK at a point where we are ready to submit.

Changes since v1:
- Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle).
- Extra check for completion in intel_read_hwsp().
Changes since v2:
- Fix inconsistent indent in hwsp_alloc() (kbuild)
- memset entire cacheline to 0.
Changes since v3:
- Do same in intel_timeline_reset_seqno(), and clflush for good measure.
Changes since v4:
- Use refcounting on timeline, instead of relying on i915_active.
- Fix waiting on kernel requests.
Changes since v5:
- Bump amount of slots to maximum (256), for best wraparounds.
- Add hwsp_offset to i915_request to fix potential wraparound hang.
- Ensure timeline wrap test works with the changes.
- Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to
fix a hang.
Changes since v6:
- Rename i915_request_active_offset to i915_request_active_seqno(),
and elaborate the function. (tvrtko)
Changes since v7:
- Move hunk to where it belongs. (jekstrand)
- Replace CACHELINE_BYTES with TIMELINE_SEQNO_BYTES. (jekstrand)

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> #v1
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-2-maarten.lankhorst@linux.intel.com


# e322551f 28-Jan-2021 Thomas Zimmermann <tzimmermann@suse.de>

drm/i915/gt: Remove references to struct drm_device.pdev

Using struct drm_device.pdev is deprecated. Convert i915 to struct
drm_device.dev. No functional changes.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-3-tzimmermann@suse.de


# a829f033 02-Mar-2021 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Wedge the GPU if command parser setup fails

Commit 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing")
introduced mandatory command parsing but setup failures were not
translated into wedging the GPU which was probably the intent.

Possible errors come in two categories. Either the sanity check on
internal tables has failed, which should be caught in CI unless an
affected platform would be missed in testing; or memory allocation failure
happened during driver load, which should be extremely unlikely but for
correctness should still be handled.

v2:
* Tidy coding style. (Chris)

[airlied: cherry-picked to avoid rc1 base]
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 311a50e76a33 ("drm/i915: Add support for mandatory cmdparsing")
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Chris Wilson <chris.p.wilson@intel.com>
Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210302114213.1102223-1-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 5a1a659762d35a6dc51047c9127c011303c77b7f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>


# f530a41d 15-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Convert stats.active to plain unsigned int

As context-in/out is now always serialised, we do not have to worry
about concurrent enabling/disable of the busy-stats and can reduce the
atomic_t active to a plain unsigned int, and the seqlock to a seqcount.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-3-chris@chris-wilson.co.uk


# 2c421896 15-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Drop atomic for engine->fw_active tracking

Since schedule-in/out is now entirely serialised by the tasklet bitlock,
we do not need to worry about concurrent in/out operations and so reduce
the atomic operations to plain instructions.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-1-chris@chris-wilson.co.uk


# 163433e5 14-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark up protected uses of 'i915_request_completed'

When we know that we are inside the timeline mutex, or inside the
submission flow (under active.lock or the holder's rcu lock), we know
that the rq->hwsp is stable and we can use the simpler direct version.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114135612.13210-1-chris@chris-wilson.co.uk


# 64362bc6 13-Jan-2021 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Replace open-coded intel_engine_stop_cs()

In the legacy ringbuffer submission, we still had an open-coded version
of intel_engine_stop_cs() with one additional verification step. Transfer
that verification to intel_engine_stop_cs() itself, and call it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113204709.15020-1-chris@chris-wilson.co.uk


# 007c4578 12-Jan-2021 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/guc: stop calling execlists_set_default_submission

Initialize all required entries from guc_set_default_submission, instead
of calling the execlists function. The previously inherited setup has
been copied over from the execlist code and simplified by removing the
execlists submission-specific parts.

v2: move setting of relative_mmio flag to engine_setup_common (Chris)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-5-daniele.ceraolospurio@intel.com


# 43aaadc6 12-Jan-2021 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/guc: init engine directly in GuC submission mode

Instead of starting the engine in execlists submission mode and then
switching to GuC, start directly in GuC submission mode. The initial
setup functions have been copied over from the execlists code
and simplified by removing the execlists submission-specific parts.

v2: remove unneeded unexpected starting state check (Chris)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <john.c.harrison@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-4-daniele.ceraolospurio@intel.com


# d0637f7a 12-Jan-2021 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/guc: do not dump execlists state with GuC submission

GuC owns the execlists state and the context IDs used for submission, so
the status of the ports and the CSB entries are not something we control
or can decode from the i915 side, therefore we can avoid dumping it. A
follow-up patch will also stop setting the csb pointers when using GuC
submission.

GuC dumps all the required events in the GuC logs when verbosity is set
high enough.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20210113021236.8164-3-daniele.ceraolospurio@intel.com


# 16f2941a 24-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Replace direct submit with direct call to tasklet

Rather than having special case code for opportunistically calling
process_csb() and performing a direct submit while holding the engine
spinlock for submitting the request, simply call the tasklet directly.
This allows us to retain the direct submission path, including the CS
draining to allow fast/immediate submissions, without requiring any
duplicated code paths, and most importantly greatly simplifying the
control flow by removing reentrancy. This will enable us to close a few
races in the virtual engines in the next few patches.

The trickiest part here is to ensure that paired operations (such as
schedule_in/schedule_out) remain under consistent locking domains,
e.g. when pulled outside of the engine->active.lock

v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET
softirq synchronous").
v3: Update engine-reset to be tasklet aware

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk


# b436a5f8 22-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Track all timelines created using the HWSP

We assume that the contents of the HWSP are lost across suspend, and so
upon resume we must restore critical values such as the timeline seqno.
Keep track of every timeline allocated that uses the HWSP as its storage
and so we can then reset all seqno values by walking that list.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201222104242.10993-1-chris@chris-wilson.co.uk


# 5ec17c76 20-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Another tweak for flushing the tasklets

tasklet_kill() ensures that we _yield_ the processor until a remote
tasklet is completed. However, this leads to a starvation condition as
being at the bottom of the scheduler's runqueue means that anything else
is able to run, including all hogs keeping the tasklet occupied.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201220134858.10510-1-chris@chris-wilson.co.uk


# 70a2b431 09-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Rename lrc.c to execlists_submission.c

We want to separate the utility functions for controlling the logical
ring context from the execlists submission mechanism (which is an
overgrown scheduler).

This is similar to Daniele's work to split up the files, but being
selfish I wanted to base it after my own changes to intel_lrc.c petered
out.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-2-chris@chris-wilson.co.uk


# 9fd96c06 09-Dec-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move move context layout registers and offsets to lrc_reg.h

Cleanup intel_lrc.h by moving some of the residual common register
definitions into intel_lrc_reg.h, prior to rebranding and splitting off
the submission backends.

v2: keep the SCHEDULE enum in the old file, since it is specific to the
gvt usage of the execlists submission backend (John)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Cc: John Harrison <John.C.Harrison@Intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201209233618.4287-1-chris@chris-wilson.co.uk


# 562675d0 19-Nov-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Update request status flags for debug pretty-printer

We plan to expand upon the number of available statuses for when we
pretty-print the requests along the timelines, and so need a new set of
flags. We have settled upon:

Unready [U]
- initial status after being submitted, the request is not
ready for execution as it is waiting for external fences

Ready [R]
- all fences the request was waiting on have been signaled,
and the request is now ready for execution and will be
in a backend queue

- a ready request may still need to wait on semaphores
[internal fences]

Ready/virtual [V]
- same as ready, but queued over multiple backends

Executing [E]
- the request has been transferred from the backend queue and
submitted for execution on HW

- a completed request may still be regarded as executing, its
status may not be updated until it is retired and removed
from the lists

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-3-chris@chris-wilson.co.uk


# 1f0e785a 19-Nov-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lift i915_request_show()

Extract i915_request_show for reuse in other request chain pretty
printers.

For a bonus point, quietly change the seqno format from %llx to %lld to
match everywhere else.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-2-chris@chris-wilson.co.uk


# 14cb9a77 19-Nov-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Include semaphore status in print_request()

When pretty-printing the requests for debug, also show the status of any
semaphore waits as part of its runnable status.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201119165616.10834-1-chris@chris-wilson.co.uk


# 5ce6861d 05-Nov-2020 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

drm/i915: Correctly set SFC capability for video engines

SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.

Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniele.ceraolospurio@intel.com
(cherry picked from commit ad18fa0f5f052046cad96fee762b5c64f42dd86a)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# ad18fa0f 05-Nov-2020 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

drm/i915: Correctly set SFC capability for video engines

SFC capability of video engines is not set correctly because i915
is testing for incorrect bits.

Fixes: c5d3e39caa45 ("drm/i915: Engine discovery query")
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: <stable@vger.kernel.org> # v5.3+
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201106011842.36203-1-daniele.ceraolospurio@intel.com


# c784e524 28-Oct-2020 John Harrison <John.C.Harrison@Intel.com>

drm/i915/guc: Update to use firmware v49.0.1

The latest GuC firmware includes a number of interface changes that
require driver updates to match.

* Starting from Gen11, the ID to be provided to GuC needs to contain
the engine class in bits [0..2] and the instance in bits [3..6].

NOTE: this patch breaks pointer dereferences in some existing GuC
functions that use the guc_id to dereference arrays but these functions
are not used for now as we have GuC submission disabled and we will
update these functions in follow up patch which requires new IDs.

* The new GuC requires the additional data structure (ADS) and associated
'private_data' pointer to be setup. This is basically a scratch area
of memory that the GuC owns. The size is read from the CSS header.

* There is now a physical to logical engine mapping table in the ADS
which needs to be configured in order for the firmware to load. For
now, the table is initialised with a 1 to 1 mapping.

* GUC_CTL_CTXINFO has been removed from the initialization params.

* reg_state_buffer is maintained internally by the GuC as part of
the private data.

* The ADS layout has changed significantly. This patch updates the
shared structure and also adds better documentation of the layout.

* While i915 does not use GuC doorbells, the firmware now requires
that some initialisation is done.

* The number of engine classes and instances supported in the ADS has
been increased.

Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Michel Thierry <michel.thierry@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Michal Winiarski <michal.winiarski@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201028145826.2949180-2-John.C.Harrison@Intel.com


# 0bda4b80 16-Sep-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Show engine properties in the pretty printer

When debugging the engine state, include the user properties that may,
or may not, have been adjusted by the user/test.

For example,
vecs0
...
Properties:
heartbeat_interval_ms: 2500 [default 2500]
max_busywait_duration_ns: 8000 [default 8000]
preempt_timeout_ms: 640 [default 640]
stop_timeout_ms: 100 [default 100]
timeslice_duration_ms: 1 [default 1]

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200916090059.3189-1-chris@chris-wilson.co.uk


# 47b08693 19-Aug-2020 Maarten Lankhorst <maarten.lankhorst@linux.intel.com>

drm/i915: Make sure execbuffer always passes ww state to i915_vma_pin.

As a preparation step for full object locking and wait/wound handling
during pin and object mapping, ensure that we always pass the ww context
in i915_gem_execbuffer.c to i915_vma_pin, use lockdep to ensure this
happens.

This also requires changing the order of eb_parse slightly, to ensure
we pass ww at a point where we could still handle -EDEADLK safely.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200819140904.1708856-15-maarten.lankhorst@linux.intel.com
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# b3786b29 31-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs

On the virtual engines, we only use the intel_breadcrumbs for tracking
signaling of stale breadcrumbs from the irq_workers. They do not have
any associated interrupt handling, active requests are passed to a
physical engine and associated breadcrumb interrupt handler. This causes
issues for us as we need to ensure that we do not actually try and
enable interrupts and the powermanagement required for them on the
virtual engine, as they will never be disabled. Instead, let's
specify the physical engine used for interrupt handler on a particular
breadcrumb.

v2: Drop b->irq_armed = true mocking for no interrupt HW

Fixes: 4fe6abb8f513 ("drm/i915/gt: Ignore irq enabling on the virtual engines")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# d1bf5dd8 30-Jul-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Support multiple pinned timelines

We may need to allocate more than one pinned context/timeline for each
engine which can utilise the per-engine HWSP, so we need to give each
a different offset within it.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200730183906.25422-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# df561f66 23-Aug-2020 Gustavo A. R. Silva <gustavoars@kernel.org>

treewide: Use fallthrough pseudo-keyword

Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>


# 0b6613c6 07-Jul-2020 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

drm/i915/sseu: Move sseu_info under gt_info

SSEUs are a GT capability, so track them under gt_info.

Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-8-daniele.ceraolospurio@intel.com


# 792592e7 07-Jul-2020 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Move the engine mask to intel_gt_info

Since the engines belong to the GT, move the runtime-updated list of
available engines to the intel_gt struct. The original mask has been
renamed to indicate it contains the maximum engine list that can be
found on a matching device.

In preparation for other info being moved to the gt in follow up patches
(sseu), introduce an intel_gt_info structure to group all gt-related
runtime info.

v2: s/max_engine_mask/platform_engine_mask (tvrtko), fix selftest

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-5-daniele.ceraolospurio@intel.com


# f6beb381 07-Jul-2020 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Move engine-related mmio init to engines_init_mmio

All the info we read in intel_device_info_init_mmio are engine-related
and since we already have an engine_init_mmio function we can just
perform the operations from there.

v2: clarify comment about forcewake requirements and pruning (Chris)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-4-daniele.ceraolospurio@intel.com


# 242613af 07-Jul-2020 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: Use the gt in HAS_ENGINE

A follow up patch will move the engine mask under the gt structure,
so get ready for that.

v2: switch the remaining gvt case using dev_priv->gt to gvt->gt (Chris)

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200708003952.21831-3-daniele.ceraolospurio@intel.com


# 4fb33953 19-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Show the culmative runtime as part of the engine info

Since we always enable the busy-stats, the culmulative runtime should be
accurate, and might be useful for diagnosing issues with the engine.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200619191053.9654-1-chris@chris-wilson.co.uk


# 810b7ee3 17-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Always report the sample time for busy-stats

Return the monotonic timestamp (ktime_get()) at the time of sampling the
busy-time. This is used in preference to taking ktime_get() separately
before or after the read seqlock as there can be some large variance in
reported timestamps. For selftests trying to ascertain that we are
reporting accurate to within a few microseconds, even a small delay
leads to the test failing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk


# 8ab3a381 09-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Incrementally check for rewinding

In commit 5ba32c7be81e ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.

However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.

Fixes: 5ba32c7be81e ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk
(cherry picked from commit e36ba817fa966f81fb1c8d16f3721b5a644b2fa9)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 570af07d 15-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Don't flush the tasklet if not setup

If the tasklet is not being used, don't try and flush it.

Fixes: 594893870044 ("drm/i915/gt: Add a safety submission flush in the heartbeat")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200615183935.17389-1-chris@chris-wilson.co.uk


# 59489387 15-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Add a safety submission flush in the heartbeat

Just in case everything fails (like for example "missed interrupt
syndrome" on Sandybridge), always flush the submission tasklet from the
heartbeat. This papers over such issues, but will still appear as a
second long glitch, and prevents us from detecting it unless we happen
to be performing a timed test.

v2: We rely on flush_submission() synchronizing with the tasklet on
another CPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200615165013.22973-3-chris@chris-wilson.co.uk


# 3e48e836 10-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Include context status in debug dumps

This may be useful to identify contexts that are running even though
they are supposed to be closed or banned.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200610154046.22449-1-chris@chris-wilson.co.uk


# e36ba817 09-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Incrementally check for rewinding

In commit 5ba32c7be81e ("drm/i915/execlists: Always force a context
reload when rewinding RING_TAIL"), we placed the check for rewinding a
context on actually submitting the next request in that context. This
was so that we only had to check once, and could do so with precision
avoiding as many forced restores as possible. For example, to ensure
that we can resubmit the same request a couple of times, we include a
small wa_tail such that on the next submission, the ring->tail will
appear to move forwards when resubmitting the same request. This is very
common as it will happen for every lite-restore to fill the second port
after a context switch.

However, intel_ring_direction() is limited in precision to movements of
upto half the ring size. The consequence being that if we tried to
unwind many requests, we could exceed half the ring and flip the sense
of the direction, so missing a force restore. As no request can be
greater than half the ring (i.e. 2048 bytes in the smallest case), we
can check for rollback incrementally. As we check against the tail that
would be submitted, we do not lose any sensitivity and allow lite
restores for the simple case. We still need to double check upon
submitting the context, to allow for multiple preemptions and
resubmissions.

Fixes: 5ba32c7be81e ("drm/i915/execlists: Always force a context reload when rewinding RING_TAIL")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.4+
Reviewed-by: Bruce Chang <yu.bruce.chang@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200609151723.12971-1-chris@chris-wilson.co.uk


# ac4fc5b3 05-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Include the engine's fw-domains in the debug info

Add engine->fw_domain/active to the pretty printer for debug dumps and
debugfs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200605144705.31127-1-chris@chris-wilson.co.uk


# 5a833995 02-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop i915_request.i915 backpointer

We infrequently use the direct i915 backpointer from the i915_request,
so do we really need to waste the space in the struct for it? 8 bytes
from the most frequently allocated struct vs an 3 bytes and pointer
chasing in using rq->engine->i915?

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk


# 0b0b2549 01-Jun-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Handle very early engine initialisation failure

If we fail during engine setup, we may leave some engines not yet setup.
During the error cleanup, we have to be careful not to try and use the
uninitialise engines before discarding them.

[ 16.136152] RIP: 0010:__flush_work+0x198/0x1b0
[ 16.136168] Code: ff ff 8b 0b 48 8b 53 08 83 e1 08 48 0f ba 2b 03 80 c9 f0 e9 63 ff ff ff 0f 0b 48 83 c4 48 44 89 f0 5b 5d 41 5c 41 5d 41 5e c3 <0f> 0b 45 31 f6 e9 62 ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 0f
[ 16.136186] RSP: 0018:ffffc900003bb928 EFLAGS: 00010246
[ 16.136201] RAX: 0000000000000000 RBX: ffff88844f392168 RCX: 0000000000000000
[ 16.136216] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88844f392168
[ 16.136231] RBP: ffff88844f392130 R08: 0000000000000000 R09: 0000000000000001
[ 16.136246] R10: ffff888441e31e40 R11: ffff88845e329c70 R12: ffff88844f796988
[ 16.136261] R13: ffff888441e4fb80 R14: 0000000000000001 R15: ffff88844f790000
[ 16.136388] FS: 00007fecbd208880(0000) GS:ffff88845e380000(0000) knlGS:0000000000000000
[ 16.136405] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 16.136420] CR2: 00007ff3ce748f90 CR3: 0000000457a6a001 CR4: 00000000000606e0
[ 16.136437] Call Trace:
[ 16.136456] ? try_to_del_timer_sync+0x3a/0x50
[ 16.136529] intel_wakeref_wait_for_idle+0x87/0xb0 [i915]
[ 16.136606] ? intel_engines_release+0x68/0xc0 [i915]
[ 16.136680] intel_engines_release+0x49/0xc0 [i915]
[ 16.136757] intel_gt_init+0x2f4/0x5e0 [i915]

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200601072446.19548-1-chris@chris-wilson.co.uk


# 7a0ba6b4 14-May-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Show per-engine default property values in sysfs

By providing the default values configured into the kernel via sysfs, it
is much more convenient for userspace to restore those sane defaults, or
at least know what are considered good baseline. This is useful, for
example, to cleanup after any failed userspace prior to commencing new
jobs.

/sys/class/drm/card0/engine/rcs0/
├── capabilities
├── class
├── .defaults
│   ├── heartbeat_interval_ms
│   ├── max_busywait_duration_ns
│   ├── preempt_timeout_ms
│   ├── stop_timeout_ms
│   └── timeslice_duration_ms
├── heartbeat_interval_ms
├── instance
├── known_capabilities
├── max_busywait_duration_ns
├── mmio_base
├── name
├── preempt_timeout_ms
├── stop_timeout_ms
└── timeslice_duration_ms

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200514062905.28668-1-chris@chris-wilson.co.uk


# 220dcfc1 07-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore

If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the
user batch or in our own preamble, the engine raises a
GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so
respond to a semaphore wait by yielding the timeslice, if we have
another context to yield to!

The only real complication is that the interrupt is only generated for
the start of the semaphore wait, and is asynchronous to our
process_csb() -- that is, we may not have registered the timeslice before
we see the interrupt. To ensure we don't miss a potential semaphore
blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark
the interrupt and apply it to the next timeslice regardless of whether it
was active at the time.

v2: We use semaphores in preempt-to-busy, within the timeslicing
implementation itself! Ergo, when we do insert a preemption due to an
expired timeslice, the new context may start with the missed semaphore
flagged by the retired context and be yielded, ad infinitum. To avoid
this, read the context id at the time of the semaphore interrupt and
only yield if that context is still active.

Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk
(cherry picked from commit c4e8ba7390346a77ffe33ec3f210bc62e0b6c8c6)
(cherry picked from commit cd60e4ac4738a6921592c4f7baf87f9a3499f0e2)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# 16e87459 29-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move the batch buffer pool from the engine to the gt

Since the introduction of 'soft-rc6', we aim to park the device quickly
and that results in frequent idling of the whole device. Currently upon
idling we free the batch buffer pool, and so this renders the cache
ineffective for many workloads. If we want to have an effective cache of
recently allocated buffers available for reuse, we need to decouple that
cache from the engine powermanagement and make it timer based. As there
is no reason then to keep it within the engine (where it once made
retirement order easier to track), we can move it up the hierarchy to the
owner of the memory allocations.

v2: Hook up to debugfs/drop_caches to clear the cache on demand.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk


# 426d0073 29-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Always enable busy-stats for execlists

In the near future, we will utilize the busy-stats on each engine to
approximate the C0 cycles of each, and use that as an input to a manual
RPS mechanism. That entails having busy-stats always enabled and so we
can remove the enable/disable routines and simplify the pmu setup. As a
consequence of always having the stats enabled, we can also show the
current active time via sysfs/engine/xcs/active_time_ns.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk


# be1cb55a 29-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Keep a no-frills swappable copy of the default context state

We need to keep the default context state around to instantiate new
contexts (aka golden rendercontext), and we also keep it pinned while
the engine is active so that we can quickly reset a hanging context.
However, the default contexts are large enough to merit keeping in
swappable memory as opposed to kernel memory, so we store them inside
shmemfs. Currently, we use the normal GEM objects to create the default
context image, but we can throw away all but the shmemfs file.

This greatly simplifies the tricky power management code which wants to
run underneath the normal GT locking, and we definitely do not want to
use any high level objects that may appear to recurse back into the GT.
Though perhaps the primary advantage of the complex GEM object is that
we aggressively cache the mapping, but here we are recreating the
vm_area everytime time we unpark. At the worst, we add a lightweight
cache, but first find a microbenchmark that is impacted.

Having started to create some utility functions to make working with
shmemfs objects easier, we can start putting them to wider use, where
GEM objects are overkill, such as storing persistent error state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld@intel.com>
Cc: Ramalingam C <ramalingam.c@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk


# 2632f174 28-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Avoid reusing the same logical CCID

The bspec is confusing on the nature of the upper 32bits of the LRC
descriptor. Once upon a time, it said that it uses the upper 32b to
decide if it should perform a lite-restore, and so we must ensure that
each unique context submitted to HW is given a unique CCID [for the
duration of it being on the HW]. Currently, this is achieved by using
a small circular tag, and assigning every context submitted to HW a
new id. However, this tag is being cleared on repinning an inflight
context such that we end up re-using the 0 tag for multiple contexts.

To avoid accidentally clearing the CCID in the upper 32bits of the LRC
descriptor, split the descriptor into two dwords so we can update the
GGTT address separately from the CCID.

Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796
Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: <stable@vger.kernel.org> # v5.5+
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk


# c4e8ba73 07-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore

If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the
user batch or in our own preamble, the engine raises a
GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so
respond to a semaphore wait by yielding the timeslice, if we have
another context to yield to!

The only real complication is that the interrupt is only generated for
the start of the semaphore wait, and is asynchronous to our
process_csb() -- that is, we may not have registered the timeslice before
we see the interrupt. To ensure we don't miss a potential semaphore
blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark
the interrupt and apply it to the next timeslice regardless of whether it
was active at the time.

v2: We use semaphores in preempt-to-busy, within the timeslicing
implementation itself! Ergo, when we do insert a preemption due to an
expired timeslice, the new context may start with the missed semaphore
flagged by the retired context and be yielded, ad infinitum. To avoid
this, read the context id at the time of the semaphore interrupt and
only yield if that context is still active.

Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk


# 848862e6 03-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Free request pool from virtual engines

While extremely unlikely to be populated, we could capture a request on
the virtual engine which we should free along with the virtual engine.

Fixes: 43acd6516ca9 ("drm/i915: Keep a per-engine request pool")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200403203303.10903-1-chris@chris-wilson.co.uk


# 43acd651 02-Apr-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Keep a per-engine request pool

Add a tiny per-engine request mempool so that we should always have a
request available for powermanagement allocations from tricky
contexts. This reserve is expected to be only used for kernel
contexts when barriers must be emitted [almost] without fail.

The main consumer for this reserved request is expected to be engine-pm,
for which we know that there will always be at least the previous pm
request that we can reuse under mempressure (so there should always be
a spare request for engine_park()).

This is an alternative to using a comparatively bulky mempool, which
requires custom handling for both our reserved allocation requirement
and to protect our TYPESAFE_BY_RCU slab cache. The advantage of mempool
would be that it would allow us to keep a larger per-engine request
pool. However, converting over to mempool is straightforward should the
need arise.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-and-tested-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402184037.21630-1-chris@chris-wilson.co.uk


# a5572d1f 31-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Align engine dump active/pending

Insert a space so that the same fields between active/pending execlists
state are aligned.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200401111554.6279-1-chris@chris-wilson.co.uk


# 60672784 31-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Include the execlists CCID of each port in the engine dump

Since we print out EXECLISTS_STATUS in the dump, also print out the CCID
of each context so we can cross check between the two.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200331094239.23145-1-chris@chris-wilson.co.uk


# 73c8bfb7 25-Mar-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop final few uses of drm_i915_private.engine

We've migrated all the heavy users over to the intel_gt, and can finally
drop the last few users and with that the mirror in dev_priv->engine[].

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200325234803.6175-1-chris@chris-wilson.co.uk


# 07bcfd12 12-Mar-2020 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915/gen12: Disable preemption timeout

Allow super long OpenCL workloads which cannot be preempted within
the default timeout to run out of the box.

v2:
* Make it stick out more and apply only to RCS. (Chris)

v3:
* Mention platform override in kconfig. (Joonas)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Michal Mrozek <michal.mrozek@intel.com>
Cc: <stable@vger.kernel.org> # v5.6+
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Michal Mrozek <Michal.mrozek@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200312115748.29970-1-tvrtko.ursulin@linux.intel.com


# 61f874d6 11-Mar-2020 Takashi Iwai <tiwai@suse.de>

drm/i915/gt: Use scnprintf() for avoiding potential buffer overflow

Since snprintf() returns the would-be-output size instead of the
actual output size, the succeeding calls may go beyond the given
buffer limit. Fix it by replacing with scnprintf().

Signed-off-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200311073256.6535-1-tiwai@suse.de


# 062444bb 28-Feb-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Expose busywait duration to sysfs

We busywait on an inflight request (one that is currently executing on
HW, and so might complete quickly) prior to setting up an interrupt and
sleeping. The trade off is that we keep an expensive CPU core busy in
order to avoid wake up latency: where that trade off should lie is best
left to the sysadmin.

The busywait mechanism can be compiled out with

./scripts/config --set-val DRM_I915_SPIN_REQUEST 0

The maximum busywait duration can be adjusted per-engine using,

/sys/class/drm/card?/engine/*/ms_busywait_duration_ns

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Steve Carbonari <steven.carbonari@intel.com>
Tested-by: Steve Carbonari <steven.carbonari@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-4-chris@chris-wilson.co.uk


# 5d8b1341 22-Feb-2020 Colin Ian King <colin.king@canonical.com>

drm/i915/gt: remove redundant assignment to variable dw

Variable dw is being initialized with a value that is never read,
it is assigned a new value later on. The assignment is redundant
and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200222134755.134209-1-colin.king@canonical.com


# 489645d5 18-Feb-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Show the cumulative context runtime in engine debug

As we have the total runtime known to us, show it when dumping the
engine state for debug.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200218162150.1300405-2-chris@chris-wilson.co.uk


# c3f1ed90 16-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Allow temporary suspension of inflight requests

In order to support out-of-line error capture, we need to remove the
active request from HW and put it to one side while a worker compresses
and stores all the details associated with that request. (As that
compression may take an arbitrary user-controlled amount of time, we
want to let the engine continue running on other workloads while the
hanging request is dumped.) Not only do we need to remove the active
request, but we also have to remove its context and all requests that
were dependent on it (both in flight, queued and future submission).

Finally once the capture is complete, we need to be able to resubmit the
request and its dependents and allow them to execute.

v2: Replace stack recursion with a simple list.
v3: Check all the parents, not just the first, when searching for a
stuck ancestor!

References: https://gitlab.freedesktop.org/drm/intel/issues/738
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
(cherry picked from commit 32ff621fd74496f0c33644125fb69ff175859b1f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# fb5970da 06-Feb-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Use the kernel_context to measure the breadcrumb size

We set up a dummy ring in order to measure the size we require for our
breadcrumb emission, so that we don't have to manually count dwords! We
can pass in the kernel_context to use for this so that if required it is
known for the breadcrumb emitter, and we can reuse some details from the
kernel_context to reduce the number of temporaries we have to mock.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200207125827.2787472-1-chris@chris-wilson.co.uk


# faea1792 31-Jan-2020 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: extract engine WA programming to common resume function

The workarounds are a common "feature" across gens and submission
mechanisms and we already call the other WA related functions from
common engine ones (<setup/cleanup>_common), so it makes sense to
do the same with WA application. Medium-term, This will help us
reduce the duplication once the GuC resume function is added, but short
term it will also allow us to use the workaround lists for pre-gen8
engine workarounds.

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200131075716.2212299-2-chris@chris-wilson.co.uk


# e3793468 30-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Use the async worker to avoid reclaim tainting the ggtt->mutex

On Braswell and Broxton (also known as Valleyview and Apollolake), we
need to serialise updates of the GGTT using the big stop_machine()
hammer. This has the side effect of appearing to lockdep as a possible
reclaim (since it uses the cpuhp mutex and that is tainted by per-cpu
allocations). However, we want to use vm->mutex (including ggtt->mutex)
from within the shrinker and so must avoid such possible taints. For this
purpose, we introduced the asynchronous vma binding and we can apply it
to the PIN_GLOBAL so long as take care to add the necessary waits for
the worker afterwards.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/211
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200130181710.2030251-3-chris@chris-wilson.co.uk


# ce016437 28-Jan-2020 Wambui Karuga <wambui.karugax@gmail.com>

drm/i915/engine_cs: use new drm logging macros in gt/intel_engine_cs.c

Conversion of the remaining printk based drm logging macros to the new
struct drm_device based logging macros in i915/gt/intel_engine_cs.c.

Signed-off-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128071437.9284-5-wambui.karugax@gmail.com


# f7043102 29-Jan-2020 Lionel Landwerlin <lionel.g.landwerlin@intel.com>

drm/i915: add extra slice common debug registers

Could be helpful for debugging purposes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200129181638.1528150-1-lionel.g.landwerlin@intel.com


# 70a76a9b 28-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Hook up CS_MASTER_ERROR_INTERRUPT

Now that we have offline error capture and can reset an engine from
inside an atomic context while also preserving the GPU state for
post-mortem analysis, it is time to handle error interrupts thrown by
the command parser.

This provides a much, much faster mechanism for us to detect known
problems than using heartbeats/hangchecks, and also provides a mechanism
for when those are disabled. However, it is limited to problems the HW
can detect in the CS and so not a complete solution for detecting lockups.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200128204318.4182039-2-chris@chris-wilson.co.uk


# 0ea60c1d 24-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Flush engine parking before release

Due to the asynchronous nature of releasing our wakerefs, we can signal
the main GT wakeref as complete before the individual engines have
cleared their own wakerefs. During shutdown we assert that the engines
are indeed parked before we release them, but for this to be always true
we need to flush their workers as well.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200124143339.140988-1-chris@chris-wilson.co.uk


# 0d4c351a 14-Jan-2020 Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>

drm/i915/gt: Make WARN* drm specific where drm_priv ptr is available

drm specific WARN* calls include device information in the
backtrace, so we know what device the warnings originate from.

Covert all the calls of WARN* with device specific drm_WARN*
variants in functions where drm_i915_private struct pointer is readily
available.

The conversion was done automatically with below coccinelle semantic
patch. checkpatch errors/warnings are fixed manually.

@rule1@
identifier func, T;
@@
func(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-WARN(
+drm_WARN(&T->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->drm,
...)
)
...+>
}

@rule2@
identifier func, T;
@@
func(struct drm_i915_private *T,...) {
<+...
(
-WARN(
+drm_WARN(&T->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->drm,
...)
)
...+>
}

command: spatch --sp-file <script> --dir drivers/gpu/drm/i915/gt \
--linux-spacing --in-place

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-7-pankaj.laxminarayan.bharadiya@intel.com


# cd699527 17-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Be paranoid and reset the GPU before release

Just in the very unlikely case we have not stopped the GPU before we
return the pages being used by the GPU to the system, force a reset.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117180309.3249427-1-chris@chris-wilson.co.uk


# 94523024 17-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Report the currently active execlists request

Since commit 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy"), we
prune the engine->active.requests list prior to preemption, thus
removing the trace of the currently executing request. If that request
hangs rather than be preempted, we conclude that no active request was
on the GPU. Fortunately, this only impacts our debugging, and not our
means of hang detection or recovery.

v2: Use from to check the current iterator before continuing, and report
active as NULL if the current request is already completed.

References: 22b7a426bbe1 ("drm/i915/execlists: Preempt-to-busy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200117113259.3023890-1-chris@chris-wilson.co.uk


# 32ff621f 16-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Allow temporary suspension of inflight requests

In order to support out-of-line error capture, we need to remove the
active request from HW and put it to one side while a worker compresses
and stores all the details associated with that request. (As that
compression may take an arbitrary user-controlled amount of time, we
want to let the engine continue running on other workloads while the
hanging request is dumped.) Not only do we need to remove the active
request, but we also have to remove its context and all requests that
were dependent on it (both in flight, queued and future submission).

Finally once the capture is complete, we need to be able to resubmit the
request and its dependents and allow them to execute.

v2: Replace stack recursion with a simple list.
v3: Check all the parents, not just the first, when searching for a
stuck ancestor!

References: https://gitlab.freedesktop.org/drm/intel/issues/738
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk


# 742379c0 09-Jan-2020 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Start chopping up the GPU error capture

In the near future, we will want to start a GPU error capture from a new
context, from inside the softirq region of a forced preemption. To do
so requires us to break up the monolithic error capture to provide new
entry points with finer control; in particular focusing on one
engine/gt, and being able to compose an error state from little pieces
of HW capture.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110123059.1348712-1-chris@chris-wilson.co.uk


# 30084b14 23-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Flush other retirees inside intel_gt_retire_requests()

Our goal in wait_for_idle (intel_gt_retire_requests) is to the current
workload *and* their idle barriers. This requires us to notice the late
arrival of those, which is done by inspecting the list of active
timelines. However, if a concurrent retirer is running that new timeline
may not be added until after we drop the lock -- so flush concurrent
retirers before we take the lock and inspect the list.

Closes: https://gitlab.freedesktop.org/drm/intel/issues/878
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191223211008.2371613-1-chris@chris-wilson.co.uk


# 7d70a123 22-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Merge engine init/setup loops

Now that we don't need to create GEM contexts in the middle of engine
construction, we can pull the engine init/setup loops together.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222144046.1674865-2-chris@chris-wilson.co.uk


# e26b6d43 21-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Pull GT initialisation under intel_gt_init()

Begin pulling the GT setup underneath a single GT umbrella; let intel_gt
take ownership of its engines! As hinted, the complication is the
lifetime of the probed engine versus the active lifetime of the GT
backends. We need to detect the engine layout early and keep it until
the end so that we can sanitize state on takeover and release.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191222120752.1368352-1-chris@chris-wilson.co.uk


# 4856254d 21-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Repeat wait_for_idle for retirement workers

Since we may retire timelines from secondary workers,
intel_gt_retire_requests() is not always a reliable indicator that all
pending retirements are complete. If we do detect secondary workers are
in progress, recommend intel_gt_wait_for_idle() to repeat the retirement
check.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221180204.1201217-1-chris@chris-wilson.co.uk


# e6ba7648 21-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Remove i915->kernel_context

Allocate only an internal intel_context for the kernel_context, forgoing
a global GEM context for internal use as we only require a separate
address space (for our own protection).

Now having weaned GT from requiring ce->gem_context, we can stop
referencing it entirely. This also means we no longer have to create random
and unnecessary GEM contexts for internal use.

GEM contexts are now entirely for tracking GEM clients, and intel_context
the execution environment on the GPU.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191221160324.1073045-1-chris@chris-wilson.co.uk


# 9f3ccd40 20-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop GEM context as a direct link from i915_request

Keep the intel_context as being the primary state for i915_request, with
the GEM context a backpointer from the low level state for the rarer
cases we need client information. Our goal is to remove such references
to clients from the backend, and leave the HW submission agnostic to
client interfaces and self-contained.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191220101230.256839-1-chris@chris-wilson.co.uk


# b81e4d9b 18-Dec-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Track engine round-trip times

Knowing the round trip time of an engine is useful for tracking the
health of the system as well as providing a metric for the baseline
responsiveness of the engine. We can use the latter metric for
automatically tuning our waits in selftests and when idling so we don't
confuse a slower system with a dead one.

Upon idling the engine, we send one last pulse to switch the context
away from precious user state to the volatile kernel context. We know
the engine is idle at this point, and the pulse is non-preemptible, so
this provides us with a good measurement of the round trip time. It also
provides us with faster engine parking for ringbuffer submission, which
is a welcome bonus (e.g. softer-rc6).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191219105043.4169050-1-chris@chris-wilson.co.uk
Link: https://patchwork.freedesktop.org/patch/msgid/20191219124353.8607-2-chris@chris-wilson.co.uk


# 639f2f24 13-Dec-2019 Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>

drm/i915: Introduce new macros for tracing

New macros ENGINE_TRACE(), CE_TRACE(), RQ_TRACE() and
GT_TRACE() are introduce to tag device name and engine
name with contexts and requests tracing in i915.

Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213155152.69182-2-venkata.s.dhanalakota@intel.com


# 92c964ca 05-Dec-2019 Andi Shyti <andi.shyti@intel.com>

drm/i915/gt: Replace I915_READ with intel_uncore_read

Get rid of the last remaining I915_READ in gt/ and make gt-land
the first I915_READ-free happy island.

Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191205164422.727968-1-chris@chris-wilson.co.uk


# 7983990c 28-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/selftests: Try to show where the pulse went

We have a case of a mysteriously absent pulse, so dump the engine
details to see if we can find out what happened to it.

References: https://bugs.freedesktop.org/show_bug.cgi?id=112405
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191128102546.3857140-1-chris@chris-wilson.co.uk


# 31177017 25-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Schedule request retirement when timeline idles

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster... Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
(cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# 4f88f874 25-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Schedule request retirement when timeline idles

The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX
corruption WA") is that it disables RC6 while Skylake (and friends) is
active, and we do not consider the GPU idle until all outstanding
requests have been retired and the engine switched over to the kernel
context. If userspace is idle, this task falls onto our background idle
worker, which only runs roughly once a second, meaning that userspace has
to have been idle for a couple of seconds before we enable RC6 again.
Naturally, this causes us to consume considerably more energy than
before as powersaving is effectively disabled while a display server
(here's looking at you Xorg) is running.

As execlists will get a completion event as each context is completed,
we can use this interrupt to queue a retire worker bound to this engine
to cleanup idle timelines. We will then immediately notice the idle
engine (without userspace intervention or the aid of the background
retire worker) and start parking the GPU. Thus during light workloads,
we will do much more work to idle the GPU faster... Hopefully with
commensurate power saving!

v2: Watch context completions and only look at those local to the engine
when retiring to reduce the amount of excess work we perform.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315
References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA")
References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk


# d231c15a 11-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Protect context while grabbing its name for the request

Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.

[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]

Fixes: c36eebd9ba5d ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk
(cherry picked from commit fecffa4668cf62e679aeea8caa9d0f241f822578)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>


# fecffa46 11-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Protect context while grabbing its name for the request

Inside print_request(), we query the context/timeline name. Nothing
immediately protects the context from being freed if the request is
complete -- we rely on serialisation by the caller to keep the name
valid until they finish using it. Inside intel_engine_dump(), we
generally only print the requests in the execution queue protected by the
engine->active.lock, but we also show the pending execlists ports which
are not protected and so require a rcu_read_lock to keep the pointer
valid.

[ 1695.700883] BUG: KASAN: use-after-free in i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.700981] Read of size 8 at addr ffff8887344f4d50 by task gem_ctx_persist/2968
[ 1695.701068]
[ 1695.701156] CPU: 1 PID: 2968 Comm: gem_ctx_persist Tainted: G U 5.4.0-rc6+ #331
[ 1695.701246] Hardware name: Intel Corporation NUC7i5BNK/NUC7i5BNB, BIOS BNKBL357.86A.0052.2017.0918.1346 09/18/2017
[ 1695.701334] Call Trace:
[ 1695.701424] dump_stack+0x5b/0x90
[ 1695.701870] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.701964] print_address_description.constprop.7+0x36/0x50
[ 1695.702408] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702856] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.702947] __kasan_report.cold.10+0x1a/0x3a
[ 1695.703390] ? i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.703836] i915_fence_get_timeline_name+0x53/0x90 [i915]
[ 1695.704241] print_request+0x82/0x2e0 [i915]
[ 1695.704638] ? fwtable_read32+0x133/0x360 [i915]
[ 1695.705042] ? write_timestamp+0x110/0x110 [i915]
[ 1695.705133] ? _raw_spin_lock_irqsave+0x79/0xc0
[ 1695.705221] ? refcount_inc_not_zero_checked+0x91/0x110
[ 1695.705306] ? refcount_dec_and_mutex_lock+0x50/0x50
[ 1695.705709] ? intel_engine_find_active_request+0x202/0x230 [i915]
[ 1695.706115] intel_engine_dump+0x2c9/0x900 [i915]

Fixes: c36eebd9ba5d ("drm/i915/gt: execlists->active is serialised by the tasklet")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191111114323.5833-1-chris@chris-wilson.co.uk


# 7caaed94 07-Nov-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Defer engine registration until fully initialised

Only add the engine to the available set of uabi engines once it has
been fully initialised and we know we want it in the public set.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Michał Wajdeczko <michal.wajdeczko@intel.com>
Cc: Andi Shyti <andi.shyti@intel.com>
Acked-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191107081252.10542-17-chris@chris-wilson.co.uk


# b79029b2 29-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Make timeslice duration configurable

Execlists uses a scheduling quantum (a timeslice) to alternate execution
between ready-to-run contexts of equal priority. This ensures that all
users (though only if they of equal importance) have the opportunity to
run and prevents livelocks where contexts may have implicit ordering due
to userspace semaphores. However, not all workloads necessarily benefit
from timeslicing and in the extreme some sysadmin may want to disable or
reduce the timeslicing granularity.

The timeslicing mechanism can be compiled out^W^W disabled (but should
DCE!) with

./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk


# 4dc0a7ca 29-Oct-2019 Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/i915: Don't try to place HWS in non-existing mappable region

HWS placement restrictions can't just rely on HAS_LLC flag.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191029095856.25431-5-matthew.auld@intel.com


# 2871ea85 24-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Split intel_ring_submission

Split the legacy submission backend from the common CS ring buffer
handling.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk


# 058179e7 23-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Replace hangcheck by heartbeats

Replace sampling the engine state every so often with a periodic
heartbeat request to measure the health of an engine. This is coupled
with the forced-preemption to allow long running requests to survive so
long as they do not block other users.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk


# 3a7a92ab 23-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Force preemption

If the preempted context takes too long to relinquish control, e.g. it
is stuck inside a shader with arbitration disabled, evict that context
with an engine reset. This ensures that preemptions are reasonably
responsive, providing a tighter QoS for the more important context at
the cost of flagging unresponsive contexts more frequently (i.e. instead
of using an ~10s hangcheck, we now evict at ~100ms). The challenge of
lies in picking a timeout that can be reasonably serviced by HW for
typical workloads, balancing the existing clients against the needs for
responsiveness.

Note that coupled with timeslicing, this will lead to rapid GPU "hang"
detection with multiple active contexts vying for GPU time.

The forced preemption mechanism can be compiled out with

./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk


# a8c51ed2 23-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Try to more gracefully quiesce the system before resets

If we are doing a normal GPU reset triggered after detecting a long
period of stalled work, we can take our time and allow the engines to
quiesce. Since we've stopped submission to the engine, and if we wait
long enough an innocent context should complete, leaving the engine idle.
So by waiting a short amount of time, we should prevent clobbering other
users when resetting a stuck context.

Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Suggested-by: Jon Bloomfield <jon.bloomfield@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-1-chris@chris-wilson.co.uk


# 7841fcbd 22-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass intel_gt to intel_engines_init

Engines belong to the GT so make it indicative in the API.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-6-tvrtko.ursulin@linux.intel.com


# 78f60603 22-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass intel_gt to intel_engines_setup

Engines belong to the GT so make it indicative in the API.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-5-tvrtko.ursulin@linux.intel.com


# b0258bf2 22-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass intel_gt to intel_engines_cleanup

Engines belong to the GT so make it indicative in the API.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-4-tvrtko.ursulin@linux.intel.com


# 3ea951c6 22-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass intel_gt to intel_setup_engine_capabilities

Engines belong to the GT so make it indicative in the API.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-3-tvrtko.ursulin@linux.intel.com


# adcb5264 22-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass intel_gt to intel_engines_init_mmio

Engines belong to the GT so make it indicative in the API.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191022094726.3001-2-tvrtko.ursulin@linux.intel.com


# 5d904e3c 17-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Pass in intel_gt at some for_each_engine sites

Where the function, or code segment, operates on intel_gt, we need to
start passing it instead of i915 to for_each_engine(_masked).

This is another partial step in migration of i915->engines[] to
gt->engines[].

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191017094500.21831-2-tvrtko.ursulin@linux.intel.com


# a50134b1 17-Oct-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make for_each_engine_masked work on intel_gt

Medium term goal is to eliminate the i915->engine[] array and to get there
we have recently introduced equivalent array in intel_gt. Now we need to
migrate the code further towards this state.

This next step is to eliminate usage of i915->engines[] from the
for_each_engine_masked iterator.

For this to work we also need to use engine->id as index when populating
the gt->engine[] array and adjust the default engine set indexing to use
engine->legacy_idx instead of assuming gt->engines[] indexing.

v2:
* Populate gt->engine[] earlier.
* Check that we don't duplicate engine->legacy_idx

v3:
* Work around the initialization order issue between default_engines()
and intel_engines_driver_register() which sets engine->legacy_idx for
now. It will be fixed properly later.

v4:
* Merge with forgotten v2.5.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com


# 2229adc8 16-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlist: Trim immediate timeslice expiry

We perform timeslicing immediately upon receipt of a request that may be
put into the second ELSP slot. The idea behind this was that since we
didn't install the timer if the second ELSP slot was empty, we would not
have any idea of how long ELSP[0] had been running and so giving the
newcomer a chance on the GPU was fair. However, this causes us extra
busy work that we may be able to avoid if we wait a jiffie for the first
timeslice as normal.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016100851.4979-1-chris@chris-wilson.co.uk


# e137d3ab 09-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: execlists->active is serialised by the tasklet

The active/pending execlists is no longer protected by the
engine->active.lock, but is serialised by the tasklet instead. Update
the locking around the debug and stats to follow suit.

v2: local_bh_disable() to prevent recursing into the tasklet in case we
trigger a softirq (Tvrtko)

Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009160906.16195-1-chris@chris-wilson.co.uk
(cherry picked from commit c36eebd9ba5d70b84e1e7408ccc7632566f285c4)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>


# c36eebd9 09-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: execlists->active is serialised by the tasklet

The active/pending execlists is no longer protected by the
engine->active.lock, but is serialised by the tasklet instead. Update
the locking around the debug and stats to follow suit.

v2: local_bh_disable() to prevent recursing into the tasklet in case we
trigger a softirq (Tvrtko)

Fixes: df403069029d ("drm/i915/execlists: Lift process_csb() out of the irq-off spinlock")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191009160906.16195-1-chris@chris-wilson.co.uk


# 6ad145fe 08-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Give engine->kernel_context distinct timeline lock classes

Assign a separate lockclass to the perma-pinned timelines of the
kernel_context, such that we can use them from within the user timelines
should we ever need to inject GPU operations to fixup faults during
request construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Auld <matthew.william.auld@gmail.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008185941.15228-1-chris@chris-wilson.co.uk


# d99f7b07 08-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Flush submission tasklet before waiting/retiring

A common bane of ours is arbitrary delays in ksoftirqd processing our
submission tasklet. Give the submission tasklet a kick before we wait to
avoid those delays eating into a tight timeout.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191008105655.13256-1-chris@chris-wilson.co.uk


# cd6a8513 07-Oct-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Prefer local path to runtime powermanagement

Avoid going to the base i915 device when we already have a path from gt
to the runtime powermanagement interface. The benefit is that it looks a
bit more self-consistent to always be acquiring the gt->uncore->rpm for
use with the gt->uncore.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191007154531.1750-1-chris@chris-wilson.co.uk


# b178a3f6 27-Sep-2019 Matthew Auld <matthew.auld@intel.com>

drm/i915: check for kernel_context

Explosions during early driver init on the error path. Make sure we fail
gracefully.

[ 9547.672258] BUG: kernel NULL pointer dereference, address: 000000000000007c
[ 9547.672288] #PF: supervisor read access in kernel mode
[ 9547.672292] #PF: error_code(0x0000) - not-present page
[ 9547.672296] PGD 8000000846b41067 P4D 8000000846b41067 PUD 797034067 PMD 0
[ 9547.672303] Oops: 0000 [#1] SMP PTI
[ 9547.672307] CPU: 1 PID: 25634 Comm: i915_selftest Tainted: G U 5.3.0-rc8+ #73
[ 9547.672313] Hardware name: /NUC6i7KYB, BIOS KYSKLi70.86A.0050.2017.0831.1924 08/31/2017
[ 9547.672395] RIP: 0010:intel_context_unpin+0x9/0x100 [i915]
[ 9547.672400] Code: 6b 60 00 e9 17 ff ff ff bd fc ff ff ff e9 7c ff ff ff 66 66 2e 0f 1f 84 00 00 00 00
00 0f 1f 40 00 0f 1f 44 00 00 41 54 55 53 <8b> 47 7c 83 f8 01 74 26 8d 48 ff f0 0f b1 4f 7c 48 8d 57 7c
75 05
[ 9547.672413] RSP: 0018:ffffae8ac24ff878 EFLAGS: 00010246
[ 9547.672417] RAX: ffff944a1b7842d0 RBX: ffff944a1b784000 RCX: ffff944a12dd6fa8
[ 9547.672422] RDX: ffff944a1b7842c0 RSI: ffff944a12dd5328 RDI: 0000000000000000
[ 9547.672428] RBP: 0000000000000000 R08: ffff944a11e5d840 R09: 0000000000000000
[ 9547.672433] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 9547.672438] R13: ffffffffc11aaf00 R14: 00000000ffffffe4 R15: ffff944a0e29bf38
[ 9547.672443] FS: 00007fc259b88ac0(0000) GS:ffff944a1f880000(0000) knlGS:0000000000000000
[ 9547.672449] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 9547.672454] CR2: 000000000000007c CR3: 0000000853346003 CR4: 00000000003606e0
[ 9547.672459] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 9547.672464] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 9547.672469] Call Trace:
[ 9547.672518] intel_engine_cleanup_common+0xe3/0x270 [i915]
[ 9547.672567] execlists_destroy+0xe/0x30 [i915]
[ 9547.672669] intel_engines_init+0x94/0xf0 [i915]
[ 9547.672749] i915_gem_init+0x191/0x950 [i915]

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190927173409.31175-2-matthew.auld@intel.com


# d19d71fc 18-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Mark i915_request.timeline as a volatile, rcu pointer

The request->timeline is only valid until the request is retired (i.e.
before it is completed). Upon retiring the request, the context may be
unpinned and freed, and along with it the timeline may be freed. We
therefore need to be very careful when chasing rq->timeline that the
pointer does not disappear beneath us. The vast majority of users are in
a protected context, either during request construction or retirement,
where the timeline->mutex is held and the timeline cannot disappear. It
is those few off the beaten path (where we access a second timeline) that
need extra scrutiny -- to be added in the next patch after first adding
the warnings about dangerous access.

One complication, where we cannot use the timeline->mutex itself, is
during request submission onto hardware (under spinlocks). Here, we want
to check on the timeline to finalize the breadcrumb, and so we need to
impose a second rule to ensure that the request->timeline is indeed
valid. As we are submitting the request, it's context and timeline must
be pinned, as it will be used by the hardware. Since it is pinned, we
know the request->timeline must still be valid, and we cannot submit the
idle barrier until after we release the engine->active.lock, ergo while
submitting and holding that spinlock, a second thread cannot release the
timeline.

v2: Don't be lazy inside selftests; hold the timeline->mutex for as long
as we need it, and tidy up acquiring the timeline with a bit of
refactoring (i915_active_add_request)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919111912.21631-1-chris@chris-wilson.co.uk


# bb120e11 15-Sep-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Show the logical context ring state on dumping

Include the active context register state when dumping the engine.

Suggested-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190915203701.29163-1-chris@chris-wilson.co.uk


# 325b916a 26-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/selftests: Ignore coherency failures on Broadwater

We've been ignoring similar coherency issues in IGT for Broadwater, and
specifically Broadwater (original gen4) and not, for example, Crestline
(same generation as Broadwater, but the mobile variant). Without any
means to reproduce locally (I have a 965GM but alas no 965G), fixing will
be slow, so tell CI to ignore any failure until we are ready with a fix.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190826133837.6784-1-chris@chris-wilson.co.uk


# eaef5b3c 23-Aug-2019 Stuart Summers <stuart.summers@intel.com>

drm/i915: Refactor instdone loops on new subslice functions

Refactor instdone loops to use the new intel_sseu_has_subslice
function.

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-10-stuart.summers@intel.com


# 0aa5427a 17-Aug-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915/tgl: Gen12 render context size

Re-use Gen11 context size for now.

[ Lucas: this is a temporary enabling patch that needs to be confirmed:
we need to check BSpec 46255 and recompute ]

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190817093902.2171-27-lucas.demarchi@intel.com


# df403069 16-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Lift process_csb() out of the irq-off spinlock

If we only call process_csb() from the tasklet, though we lose the
ability to bypass ksoftirqd interrupt processing on direct submission
paths, we can push it out of the irq-off spinlock.

The penalty is that we then allow schedule_out to be called concurrently
with schedule_in requiring us to handle the usage count (baked into the
pointer itself) atomically.

As we do kick the tasklets (via local_bh_enable()) after our submission,
there is a possibility there to see if we can pull the local softirq
processing back from the ksoftirqd.

v2: Store the 'switch_priority_hint' on submission, so that we can
safely check during process_csb().

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190816171608.11760-1-chris@chris-wilson.co.uk


# e5dadff4 15-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Protect request retirement with timeline->mutex

Forgo the struct_mutex requirement for request retirement as we have
been transitioning over to only using the timeline->mutex for
controlling the lifetime of a request on that timeline.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-4-chris@chris-wilson.co.uk


# b26496ae 13-Aug-2019 Stuart Summers <stuart.summers@intel.com>

drm/i915: Print CCID for all renderCS

Use render class instead of RCS0 when printing CCID.

Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190813174121.129593-2-stuart.summers@intel.com


# a4eb99a1 13-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Include engine->mmio_base in the debug dump

Some IGT would like to know the mmio address of each engine so make it
available.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Stuart Summers <stuart.summers@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190813215707.14703-1-chris@chris-wilson.co.uk


# 4ecd20c9 12-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Use the local engine wakeref when checking RING registers

Now that we can atomically acquire the engine wakeref, make use of it
when check whether the RING registers are idle.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190812091045.29587-7-chris@chris-wilson.co.uk


# 75d0a7f3 09-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lift timeline into intel_context

Move the timeline from being inside the intel_ring to intel_context
itself. This saves much pointer dancing and makes the relations of the
context to its timeline much clearer.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk


# 48ae397b 09-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Push the ring creation flags to the backend

Push the ring creation flags from the outer GEM context to the inner
intel_context to avoid an unsightly back-reference from inside the
backend.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Andi Shyti <andi.shyti@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-3-chris@chris-wilson.co.uk


# c7302f20 08-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Defer final intel_wakeref_put to process context

As we need to acquire a mutex to serialise the final
intel_wakeref_put, we need to ensure that we are in process context at
that time. However, we want to allow operation on the intel_wakeref from
inside timer and other hardirq context, which means that need to defer
that final put to a workqueue.

Inside the final wakeref puts, we are safe to operate in any context, as
we are simply marking up the HW and state tracking for the potential
sleep. It's only the serialisation with the potential sleeping getting
that requires careful wait avoidance. This allows us to retain the
immediate processing as before (we only need to sleep over the same
races as the current mutex_lock).

v2: Add a selftest to ensure we exercise the code while lockdep watches.
v3: That test was extremely loud and complained about many things!
v4: Not a whale!

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111295
References: https://bugs.freedesktop.org/show_bug.cgi?id=111245
References: https://bugs.freedesktop.org/show_bug.cgi?id=111256
Fixes: 18398904ca9e ("drm/i915: Only recover active engines")
Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808202758.10453-1-chris@chris-wilson.co.uk


# 38775829 07-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Allocate kernel_contexts directly

Ignore the central i915->kernel_context for allocating an engine, as
that GEM context is being phased out. For internal clients, we just need
the per-engine logical state, so allocate it at the point of use.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190808110612.23539-1-chris@chris-wilson.co.uk


# 2edda80d 06-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename engines to match their user interface

During engine setup, we may find that some engines are fused off causing
a misalignment between internal names and the instances seen by users,
e.g. (I915_ENGINE_CLASS_VIDEO_DECODE, 1) may be vcs2 in hardware.
Normally this is invisible to the user, but a few debug interfaces (and
our own internal tracing) use the original HW name not the name the user
would expect as formed from their class:instance tuple. Replace our
internal name with the uabi name for consistency with, for example, error
states.

v2: Keep the pretty printing of class name in the selftest

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111311
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190807110431.8130-1-chris@chris-wilson.co.uk


# 750e76b4 06-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Move the [class][inst] lookup for engines onto the GT

To maintain a fast lookup from a GT centric irq handler, we want the
engine lookup tables on the intel_gt. To avoid having multiple copies of
the same multi-dimension lookup table, move the generic user engine
lookup into an rbtree (for fast and flexible indexing).

v2: Split uabi_instance cf uabi_class
v3: Set uabi_class/uabi_instance after collating all engines to provide a
stable uabi across parallel unordered construction.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2
Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk


# b40d7378 04-Aug-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace struct_mutex for batch pool serialisation

Switch to tracking activity via i915_active on individual nodes, only
keeping a list of retired objects in the cache, and reaping the cache
when the engine itself idles.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk


# 50d84418 02-Aug-2019 Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/i915: Add i915 to i915_inject_probe_failure

With i915 added to i915_inject_probe_failure we can use dedicated
printk when injecting artificial load failure.

Also make this function look like other i915 functions that return
error code and make it more flexible to return any provided error
code instead of previously assumed -ENODEV.

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190802184055.31988-2-michal.wajdeczko@intel.com


# 0bbfdce3 17-Jul-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix GEN8_MCR_SELECTOR programming

fls returns bit positions starting from one for the lsb and the MCR
register expects zero based (sub)slice addressing.

Incorrent MCR programming can have the effect of directing MMIO reads of
registers in the 0xb100-0xb3ff range to invalid subslice returning zeroes
instead of actual content.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1e40d4aea57b ("drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-2-tvrtko.ursulin@linux.intel.com
(cherry picked from commit 15160879d47213c32f357bc67b6014d9aaf14ed7)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 982b1d00 15-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lock the engine while dumping the active request

We cannot let the request be retired and freed while we are trying to
dump it during error capture. It is not sufficient just to grab a
reference to the request, as during retirement we may free the ring
which we are also dumping. So take the engine lock to prevent retiring
and freeing of the request.

Reported-by: Alex Shumsky <alexthreed@gmail.com>
Fixes: 83c317832eb1 ("drm/i915: Dump the ringbuffer of the active request for debugging")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Shumsky <alexthreed@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190715080946.15593-6-chris@chris-wilson.co.uk
(cherry picked from commit cfe7288c276e359eebf057699fe86c2f8af14224)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>


# 6c2b0103 17-Jul-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix and improve MCR selection logic

A couple issues were present in this code:

1.
fls() usage was incorrect causing off by one in subslice mask lookup,
which in other words means subslice mask of all zeroes is always used
(subslice mask of a slice which is not present, or even out of bounds
array access), rendering the checks in wa_init_mcr either futile or
random.

2.
Condition in WARN_ON was not correct. It is doing a bitwise and operation
between a positive (present subslices) and negative mask (disabled L3
banks).

This means that with corrected fls() usage the assert would always
incorrectly fail.

We could fix this by inverting the fuse bits in the check, but instead do
one better and improve the code so it not only asserts, but finds the
first common index between the two masks and only warns if no such index
can be found.

v2:
* Simplify check for logic and redability.
* Improve commentary explaining what is really happening ie. what the
assert is really trying to check and why.

v3:
* Find first common index instead of just asserting.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: fe864b76c2ab ("drm/i915: Implement WaProgramMgsrForL3BankSpecificMmioReads")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Cc: Michał Winiarski <michal.winiarski@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-4-tvrtko.ursulin@linux.intel.com


# 7405cb77 17-Jul-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Trust programmed MCR in read_subslice_reg

Instead of re-calculating the MCR selector in read_subslice_reg do the
rwm on its existing value and restore it when done.

This consolidates MCR programming to one place for cnl+, and avoids
re-calculating its default value on older platforms during hangcheck.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-3-tvrtko.ursulin@linux.intel.com


# 15160879 17-Jul-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Fix GEN8_MCR_SELECTOR programming

fls returns bit positions starting from one for the lsb and the MCR
register expects zero based (sub)slice addressing.

Incorrent MCR programming can have the effect of directing MMIO reads of
registers in the 0xb100-0xb3ff range to invalid subslice returning zeroes
instead of actual content.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 1e40d4aea57b ("drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads")
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190717180624.20354-2-tvrtko.ursulin@linux.intel.com


# 09975b86 09-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Disable preemption under GVT

Preempt-to-busy uses a GPU semaphore to enforce an idle-barrier across
preemption, but mediated gvt does not fully support semaphores.

v2: Fiddle around with the flags and settle on using has-semaphores for
the core bits so that we retain the ability to preempt our own
semaphores.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: Xiaolin Zhang <xiaolin.zhang@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Acked-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190709091233.8573-1-chris@chris-wilson.co.uk


# cfe7288c 15-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Lock the engine while dumping the active request

We cannot let the request be retired and freed while we are trying to
dump it during error capture. It is not sufficient just to grab a
reference to the request, as during retirement we may free the ring
which we are also dumping. So take the engine lock to prevent retiring
and freeing of the request.

Reported-by: Alex Shumsky <alexthreed@gmail.com>
Fixes: 83c317832eb1 ("drm/i915: Dump the ringbuffer of the active request for debugging")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Alex Shumsky <alexthreed@gmail.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190715080946.15593-6-chris@chris-wilson.co.uk


# cb823ed9 12-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Use intel_gt as the primary object for handling resets

Having taken the first step in encapsulating the functionality by moving
the related files under gt/, the next step is to start encapsulating by
passing around the relevant structs rather than the global
drm_i915_private. In this step, we pass intel_gt to intel_reset.c

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712192953.9187-1-chris@chris-wilson.co.uk


# f2db53f1 12-Jul-2019 Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>

drm/i915: Replace "_load" with "_probe" consequently

Use the "_probe" nomenclature not only in i915_driver_probe() helper
name but also in other related function / variable names for
consistency. Only the userspace exposed name of a related module
parameter is left untouched.

Signed-off-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190712112429.740-4-janusz.krzysztofik@linux.intel.com


# 71b0846c 09-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/guc: Remove preemption support for current fw

Preemption via GuC submission is not being supported with its current
legacy incarnation. The current FW does support a similar pre-emption
flow via H2G, but it is class-based instead of being instance-based,
which doesn't fit well with the i915 tracking. To fix this, the
firmware is being updated to better support our needs with a new flow,
so we can safely remove the old code.

v2 (Daniele): resurrect & rebase, reword commit message, remove
preempt_context as well

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Cc: John Harrison <John.C.Harrison@Intel.com>
Acked-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Michał Winiarski <michal.winiarski@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190710005437.3496-2-daniele.ceraolospurio@intel.com


# 4a5fdc96 05-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Remove presumption of RCS0

We now track features correctly instead of probing i915->engine[RCS0]
which is much more flexible and avoids any nasty surprises.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190705124325.14270-2-chris@chris-wilson.co.uk


# ab9e2f77 03-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/gt: Pull engine w/a initialisation into common

We need to setup the workarounds on all engines, with the knowledge
about which platforms each workaround applies to kept together in the
workaround list. As such, we can pull the w/a initialisation into the
common setup and try to avoid duplicating knowledge about when to setup
the workarounds.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703135805.7310-2-chris@chris-wilson.co.uk


# bf73fc0f 03-Jul-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Show support for accurate sw PMU busyness tracking

Expose whether or not we support the PMU software tracking in our
scheduler capabilities, so userspace can query at runtime.

v2: Use I915_SCHEDULER_CAP_ENGINE_BUSY_STATS for a less ambiguous
capability name.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190703143702.11339-1-chris@chris-wilson.co.uk


# 315ca4c4 02-Jul-2019 Ville Syrjälä <ville.syrjala@linux.intel.com>

drm/i915: synchronize_irq() against the actual irq

When eliminating our use of drm_irq_install() I failed to convert
all our synchronize_irq() calls to consult pdev->irq instead of
dev_priv->drm.irq. As we no longer populate dev_priv->drm.irq
we're no longer synchronizing against anything.

v2: Add intel_syncrhonize_irq() (Chris)

Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reported-by: Imre Deak <imre.deak@intel.com>
Fixes: b318b82455bd ("drm/i915: Nuke drm_driver irq vfuncs")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111012
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190702151723.29739-1-ville.syrjala@linux.intel.com


# 5f22e5b3 25-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Rename intel_wakeref_[is]_active

Our general rule is to use is/has as the verb for boolean functions,
rename intel_wakeref_active to intel_wakeref_is_active so the question
being asked is clear.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190625130128.11009-6-chris@chris-wilson.co.uk


# db56f974 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Eliminate dual personality of i915_scratch_offset

Scratch vma lives under gt but the API used to work on i915. Make this
consistent by renaming the function to intel_gt_scratch_offset and make
it take struct intel_gt.

v2:
* Move to intel_gt. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-33-tvrtko.ursulin@linux.intel.com


# f0c02c1b 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Rename i915_timeline to intel_timeline and move under gt

Move all timeline code under gt and rename to intel_gt prefix.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com


# 4c6d51ea 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make timelines gt centric

Our timelines are stored inside intel_gt so we can convert the interface
to take exactly that and not i915.

At the same time re-order the params to our more typical layout and
replace the backpointer to the new containing structure.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-31-tvrtko.ursulin@linux.intel.com


# ba4134a4 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Save trip via top-level i915 in a few more places

For gt related operations it makes more logical sense to stay in the realm
of gt instead of dereferencing via driver i915.

This patch handles a few of the easy ones with work requiring more
refactoring still outstanding.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-30-tvrtko.ursulin@linux.intel.com


# f937f561 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Store backpointer to intel_gt in the engine

It will come useful in the next patch.

v2:
* Do mock_engine as well.

v3:
* And the virtual engine...

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-11-tvrtko.ursulin@linux.intel.com


# eaf522f6 21-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make i915_check_and_clear_faults take intel_gt

Continuing the conversion and elimination of implicit dev_priv.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-6-tvrtko.ursulin@linux.intel.com


# 22b7a426 20-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Preempt-to-busy

When using a global seqno, we required a precise stop-the-workd event to
handle preemption and unwind the global seqno counter. To accomplish
this, we would preempt to a special out-of-band context and wait for the
machine to report that it was idle. Given an idle machine, we could very
precisely see which requests had completed and which we needed to feed
back into the run queue.

However, now that we have scrapped the global seqno, we no longer need
to precisely unwind the global counter and only track requests by their
per-context seqno. This allows us to loosely unwind inflight requests
while scheduling a preemption, with the enormous caveat that the
requests we put back on the run queue are still _inflight_ (until the
preemption request is complete). This makes request tracking much more
messy, as at any point then we can see a completed request that we
believe is not currently scheduled for execution. We also have to be
careful not to rewind RING_TAIL past RING_HEAD on preempting to the
running context, and for this we use a semaphore to prevent completion
of the request before continuing.

To accomplish this feat, we change how we track requests scheduled to
the HW. Instead of appending our requests onto a single list as we
submit, we track each submission to ELSP as its own block. Then upon
receiving the CS preemption event, we promote the pending block to the
inflight block (discarding what was previously being tracked). As normal
CS completion events arrive, we then remove stale entries from the
inflight tracker.

v2: Be a tinge paranoid and ensure we flush the write into the HWS page
for the GPU semaphore to pick in a timely fashion.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk


# eca15360 18-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Don't dereference request if it may have been retired when printing

This has caught me out on countless occasions, when we retrieve a pointer
from the submission/execlists backend, it does not carry a reference to
the context or ring. Those are only pinned while the request is active,
so if we see the request is already completed, it may be in the process
of being retired and those pointers defunct.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110938
Fixes: 3a068721a973 ("drm/i915: Show ring->start for the ELSP context/request queue")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190618161951.28820-2-chris@chris-wilson.co.uk


# 422d7df4 14-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Replace engine->timeline with a plain list

To continue the onslaught of removing the assumption of a global
execution ordering, another casualty is the engine->timeline. Without an
actual timeline to track, it is overkill and we can replace it with a
much less grand plain list. We still need a list of requests inflight,
for the simple purpose of finding inflight requests (for retiring,
resetting, preemption etc).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk


# ce476c80 14-Jun-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Keep contexts pinned until after the next kernel context switch

We need to keep the context image pinned in memory until after the GPU
has finished writing into it. Since it continues to write as we signal
the final breadcrumb, we need to keep it pinned until the request after
it is complete. Currently we know the order in which requests execute on
each engine, and so to remove that presumption we need to identify a
request/context-switch we know must occur after our completion. Any
request queued after the signal must imply a context switch, for
simplicity we use a fresh request from the kernel context.

The sequence of operations for keeping the context pinned until saved is:

- On context activation, we preallocate a node for each physical engine
the context may operate on. This is to avoid allocations during
unpinning, which may be from inside FS_RECLAIM context (aka the
shrinker)

- On context deactivation on retirement of the last active request (which
is before we know the context has been saved), we add the
preallocated node onto a barrier list on each engine

- On engine idling, we emit a switch to kernel context. When this
switch completes, we know that all previous contexts must have been
saved, and so on retiring this request we can finally unpin all the
contexts that were marked as deactivated prior to the switch.

We can enhance this in future by flushing all the idle contexts on a
regular heartbeat pulse of a switch to kernel context, which will also
be used to check for hung engines.

v2: intel_context_active_acquire/_release

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk


# d858d569 13-Jun-2019 Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>

drm/i915: update rpm_get/put to use the rpm structure

The functions where internally already only using the structure, so we
need to just flip the interface.

v2: rebase

Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190613232156.34940-7-daniele.ceraolospurio@intel.com


# f398bbde 10-Jun-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Make read_subslice_reg take engine

The function operates on the render engine so make the input reflect it.

v2:
* Pass engine to read_subslice_reg. (Chris)
* Drop inline from read_subslice_reg.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190610125706.26110-1-tvrtko.ursulin@linux.intel.com


# a10f361d 29-May-2019 Jani Nikula <jani.nikula@intel.com>

Revert "drm/i915: Expand subslice mask"

This reverts commit 1ac159e23c2c ("drm/i915: Expand subslice mask"),
which kills ICL due to GEM_BUG_ON() sanity checks before CI even gets a
chance to do anything.

The commit exposes an issue in commit 1e40d4aea57b ("drm/i915/cnl:
Implement WaProgramMgsrForCorrectSliceSpecificMmioReads"), which will
also need to be addressed.

There's a proposed fix [1], but considering the seeming uncertainty with
the fix as well as the size of the regressing commit (in this context,
the one that actually brings down ICL), this warrants a revert to get
ICL working, and gives us time to get all of this right without
rushing. Even if this means shooting the messenger.

<3>[ 9.426327] intel_sseu_get_subslices:46 GEM_BUG_ON(slice >= sseu->max_slices)
<4>[ 9.426355] ------------[ cut here ]------------
<2>[ 9.426357] kernel BUG at drivers/gpu/drm/i915/gt/intel_sseu.c:46!
<4>[ 9.426371] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
<4>[ 9.426377] CPU: 1 PID: 364 Comm: systemd-udevd Not tainted 5.2.0-rc2-CI-CI_DRM_6159+ #1
<4>[ 9.426385] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019
<4>[ 9.426444] RIP: 0010:intel_sseu_get_subslices+0x8a/0xe0 [i915]
<4>[ 9.426452] Code: d5 76 b7 e0 48 8b 35 9d 24 21 00 49 c7 c0 07 f0 72 a0 b9 2e 00 00 00 48 c7 c2 00 8e 6d a0 48 c7 c7 a5 14 5b a0 e8 36 3c be e0 <0f> 0b 48 c7 c1 80 d5 6f a0 ba 30 00 00 00 48 c7 c6 00 8e 6d a0 48
<4>[ 9.426468] RSP: 0018:ffffc9000037b9c8 EFLAGS: 00010282
<4>[ 9.426475] RAX: 000000000000000f RBX: 0000000000000000 RCX: 0000000000000000
<4>[ 9.426482] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88849e346f98
<4>[ 9.426490] RBP: ffff88848a200000 R08: 0000000000000004 R09: ffff88849d50b000
<4>[ 9.426497] R10: 0000000000000000 R11: ffff88849e346f98 R12: ffff88848a209e78
<4>[ 9.426505] R13: 0000000003000000 R14: ffff88848a20b1a8 R15: 0000000000000000
<4>[ 9.426513] FS: 00007f73d5ae8680(0000) GS:ffff88849fc80000(0000) knlGS:0000000000000000
<4>[ 9.426521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 9.426527] CR2: 0000561417b01260 CR3: 0000000494764003 CR4: 0000000000760ee0
<4>[ 9.426535] PKRU: 55555554
<4>[ 9.426538] Call Trace:
<4>[ 9.426585] wa_init_mcr+0xd5/0x110 [i915]
<4>[ 9.426597] ? lock_acquire+0xa6/0x1c0
<4>[ 9.426645] icl_gt_workarounds_init+0x21/0x1a0 [i915]
<4>[ 9.426694] ? i915_driver_load+0xfcf/0x18a0 [i915]
<4>[ 9.426739] gt_init_workarounds+0x14c/0x230 [i915]
<4>[ 9.426748] ? _raw_spin_unlock_irq+0x24/0x50
<4>[ 9.426789] intel_gt_init_workarounds+0x1b/0x30 [i915]
<4>[ 9.426835] i915_driver_load+0xfd7/0x18a0 [i915]
<4>[ 9.426843] ? lock_acquire+0xa6/0x1c0
<4>[ 9.426850] ? __pm_runtime_resume+0x4f/0x80
<4>[ 9.426857] ? _raw_spin_unlock_irqrestore+0x4c/0x60
<4>[ 9.426863] ? _raw_spin_unlock_irqrestore+0x4c/0x60
<4>[ 9.426870] ? lockdep_hardirqs_on+0xe3/0x1b0
<4>[ 9.426915] i915_pci_probe+0x29/0xa0 [i915]
<4>[ 9.426923] pci_device_probe+0x9e/0x120
<4>[ 9.426930] really_probe+0xea/0x3c0
<4>[ 9.426936] driver_probe_device+0x10b/0x120
<4>[ 9.426942] device_driver_attach+0x4a/0x50
<4>[ 9.426948] __driver_attach+0x97/0x130
<4>[ 9.426954] ? device_driver_attach+0x50/0x50
<4>[ 9.426960] bus_for_each_dev+0x74/0xc0
<4>[ 9.426966] bus_add_driver+0x13f/0x210
<4>[ 9.426971] ? 0xffffffffa083b000
<4>[ 9.426976] driver_register+0x56/0xe0
<4>[ 9.426982] ? 0xffffffffa083b000
<4>[ 9.426987] do_one_initcall+0x58/0x300
<4>[ 9.426994] ? do_init_module+0x1d/0x1f6
<4>[ 9.427001] ? rcu_read_lock_sched_held+0x6f/0x80
<4>[ 9.427007] ? kmem_cache_alloc_trace+0x261/0x290
<4>[ 9.427014] do_init_module+0x56/0x1f6
<4>[ 9.427020] load_module+0x24d1/0x2990
<4>[ 9.427032] ? __se_sys_finit_module+0xd3/0xf0
<4>[ 9.427037] __se_sys_finit_module+0xd3/0xf0
<4>[ 9.427047] do_syscall_64+0x55/0x1c0
<4>[ 9.427053] entry_SYSCALL_64_after_hwframe+0x49/0xbe
<4>[ 9.427059] RIP: 0033:0x7f73d5609839
<4>[ 9.427064] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48
<4>[ 9.427082] RSP: 002b:00007ffdf34477b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
<4>[ 9.427091] RAX: ffffffffffffffda RBX: 00005559fd5d7b40 RCX: 00007f73d5609839
<4>[ 9.427099] RDX: 0000000000000000 RSI: 00007f73d52e8145 RDI: 000000000000000f
<4>[ 9.427106] RBP: 00007f73d52e8145 R08: 0000000000000000 R09: 00007ffdf34478d0
<4>[ 9.427114] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000
<4>[ 9.427121] R13: 00005559fd5c90f0 R14: 0000000000020000 R15: 00005559fd5d7b40
<4>[ 9.427131] Modules linked in: i915(+) mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep e1000e snd_hda_core ghash_clmulni_intel ptp snd_pcm cdc_ether usbnet mii pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc
<4>[ 9.427254] ---[ end trace af3eeb543bd66e66 ]---

[1] http://patchwork.freedesktop.org/patch/msgid/20190528200655.11605-1-chris@chris-wilson.co.uk

References: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6159/fi-icl-u2/pstore0-1517155098_Oops_1.log
References: 1e40d4aea57b ("drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads")
Fixes: 1ac159e23c2c ("drm/i915: Expand subslice mask")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Manasi Navare <manasi.d.navare@intel.com>
Cc: Michel Thierry <michel.thierry@intel.com>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Cc: Oscar Mateo <oscar.mateo@intel.com>
Cc: Stuart Summers <stuart.summers@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Yunwei Zhang <yunwei.zhang@intel.com>
Acked-by: Daniel Vetter <daniel@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190529082150.31526-1-jani.nikula@intel.com


# 1ac159e2 24-May-2019 Stuart Summers <stuart.summers@intel.com>

drm/i915: Expand subslice mask

Currently, the subslice_mask runtime parameter is stored as an
array of subslices per slice. Expand the subslice mask array to
better match what is presented to userspace through the
I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is
then calculated:
slice * subslice stride + subslice index / 8

v2: fix spacing in set_sseu_info args
use set_sseu_info to initialize sseu data when building
device status in debugfs
rename variables in intel_engine_types.h to avoid checkpatch
warnings
v3: update headers in intel_sseu.h
v4: add const to some sseu_dev_info variables
use sseu->eu_stride for EU stride calculations
v5: address review comments from Tvrtko and Daniele
v6: remove extra space in intel_sseu_get_subslices
return the correct subslice enable in for_each_instdone
add GEM_BUG_ON to ensure user doesn't pass invalid ss_mask size
use printk formatted string for subslice mask
v7: remove string.h header and rebase

Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Manasi Navare <manasi.d.navare@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190524154022.13575-6-stuart.summers@intel.com


# c017cf6b 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Drop the deferred active reference

An old optimisation to reduce the number of atomics per batch sadly
relies on struct_mutex for coordination. In order to remove struct_mutex
from serialising object/context closing, always taking and releasing an
active reference on first use / last use greatly simplifies the locking.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-15-chris@chris-wilson.co.uk


# 10be98a7 28-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move more GEM objects under gem/

Continuing the theme of separating out the GEM clutter.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-8-chris@chris-wilson.co.uk


# ffd5ce22 27-May-2019 Michal Wajdeczko <michal.wajdeczko@intel.com>

drm/i915/guc: Updates for GuC 32.0.3 firmware

New GuC 32.0.3 firmware made many changes around its ABI that
require driver updates:

* FW release version numbering schema now includes patch number
* FW release version encoding in CSS header
* Boot parameters
* Suspend/resume protocol
* Sample-forcewake command
* Additional Data Structures (ADS)

This commit is a squash of patches 3-8 from series [1].
[1] https://patchwork.freedesktop.org/series/58760/

Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Cc: Anusha Srivatsa <anusha.srivatsa@intel.com>
Cc: Jeff Mcgee <jeff.mcgee@intel.com>
Cc: John Spotswood <john.a.spotswood@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Tomasz Lis <tomasz.lis@intel.com>
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # numbering schema
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ccs heaser
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # boot params
Acked-by: John Spotswood <john.a.spotswood@intel.com> # suspend/resume
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # sample-forcewake
Acked-by: John Spotswood <john.a.spotswood@intel.com> # sample-forcewake
Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> # ADS
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190527183613.17076-4-michal.wajdeczko@intel.com


# c5d3e39c 22-May-2019 Tvrtko Ursulin <tvrtko.ursulin@intel.com>

drm/i915: Engine discovery query

Engine discovery query allows userspace to enumerate engines, probe their
configuration features, all without needing to maintain the internal PCI
ID based database.

A new query for the generic i915 query ioctl is added named
DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure
drm_i915_query_engine_info. The address of latter should be passed to the
kernel in the query.data_ptr field, and should be large enough for the
kernel to fill out all known engines as struct drm_i915_engine_info
elements trailing the query.

As with other queries, setting the item query length to zero allows
userspace to query minimum required buffer size.

Enumerated engines have common type mask which can be used to query all
hardware engines, versus engines userspace can submit to using the execbuf
uAPI.

Engines also have capabilities which are per engine class namespace of
bits describing features not present on all engine instances.

v2:
* Fixed HEVC assignment.
* Reorder some fields, rename type to flags, increase width. (Lionel)
* No need to allocate temporary storage if we do it engine by engine.
(Lionel)

v3:
* Describe engine flags and mark mbz fields. (Lionel)
* HEVC only applies to VCS.

v4:
* Squash SFC flag into main patch.
* Tidy some comments.

v5:
* Add uabi_ prefix to engine capabilities. (Chris Wilson)
* Report exact size of engine info array. (Chris Wilson)
* Drop the engine flags. (Joonas Lahtinen)
* Added some more reserved fields.
* Move flags after class/instance.

v6:
* Do not check engine info array was zeroed by userspace but zero the
unused fields for them instead.

v7:
* Simplify length calculation loop. (Lionel Landwerlin)

v8:
* Remove MBZ comments where not applicable.
* Rename ABI flags to match engine class define naming.
* Rename SFC ABI flag to reflect it applies to VCS and VECS.
* SFC is wired to even _logical_ engine instances.
* SFC applies to VCS and VECS.
* HEVC is present on all instances on Gen11. (Tony)
* Simplify length calculation even more. (Chris Wilson)
* Move info_ptr assigment closer to loop for clarity. (Chris Wilson)
* Use vdbox_sfc_access from runtime info.
* Rebase for RUNTIME_INFO.
* Refactor for lower indentation.
* Rename uAPI class/instance to engine_class/instance to avoid C++
keyword.

v9:
* Rebase for s/num_rings/num_engines/ in RUNTIME_INFO.

v10:
* Use new copy_query_item.

v11:
* Consolidate with struct i915_engine_class_instnace.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tony Ye <tony.ye@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com


# 519a0194 08-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD

After realising we need to sample RING_START to detect context switches
from preemption events that do not allow for the seqno to advance, we
can also realise that the seqno itself is just a distance along the ring
and so can be replaced by sampling RING_HEAD.

v2: Bonus comment for the mystery separate CS_STALL before MI_USER_INTERRUPT

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190508080704.24223-1-chris@chris-wilson.co.uk


# dc58958d 02-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Assert the local engine->wakeref is active

Due to the asynchronous tasklet and recursive GT wakeref, it may happen
that we submit to the engine (underneath it's own wakeref) prior to the
central wakeref being marked as taken. Switch to checking the local wakeref
for greater consistency.

Fixes: 79ffac8599c4 ("drm/i915: Invert the GEM wakeref hierarchy")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503115225.30831-3-chris@chris-wilson.co.uk


# c34c5bca 03-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915/execlists: Flush the tasklet on parking

Tidy up the cleanup sequence by always ensure that the tasklet is
flushed on parking (before we cleanup). The parking provides a
convenient point to ensure that the backend is truly idle.

v2: Do the full check for idleness before parking, to be sure we flush
any residual interrupt.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190503080942.30151-1-chris@chris-wilson.co.uk


# 8c334f24 30-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Include fence signaled bit in print_request()

Show the fence flags view of request completion in addition to the
normal hwsp check and whether signaling is enabled.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501114541.10077-2-chris@chris-wilson.co.uk


# 45b9c968 01-May-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move the engine->destroy() vfunc onto the engine

Make the engine responsible for cleaning itself up!

This removes the i915->gt.cleanup vfunc that has been annoying the
casual reader and myself for the last several years, and helps keep a
future patch to add more cleanup tidy.

v2: Assert that engine->destroy is set after the backend starts
allocating its own state.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501103204.18632-1-chris@chris-wilson.co.uk


# 5e2a0419 26-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Switch back to an array of logical per-engine HW contexts

We switched to a tree of per-engine HW context to accommodate the
introduction of virtual engines. However, we plan to also support
multiple instances of the same engine within the GEM context, defeating
our use of the engine as a key to looking up the HW context. Just
allocate a logical per-engine instance and always use an index into the
ctx->engines[]. Later on, this ctx->engines[] may be replaced by a user
specified map.

v2: Add for_each_gem_engine() helper to iterator within the engines lock
v3: intel_context_create_request() helper
v4: s/unsigned long/unsigned int/ 4 billion engines is quite enough.
v5: Push iterator locking to caller

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-7-chris@chris-wilson.co.uk


# 11334c6a 26-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Split engine setup/init into two phases

In the next patch, we require the engine vfuncs setup prior to
initialising the pinned kernel contexts, so split the vfunc setup from
the engine initialisation and call it earlier.

v2: s/setup_xcs/setup_common/ for intel_ring_submission_setup()

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-6-chris@chris-wilson.co.uk


# fa9f6681 26-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Export intel_context_instance()

We want to pass in a intel_context into intel_context_pin() and that
requires us to first be able to lookup the intel_context!

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190426163336.15906-2-chris@chris-wilson.co.uk


# 9ce9bdb0 19-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Enable render context support for gen4 (Broadwater to Cantiga)

Broadwater and the rest of gen4 do support being able to saving and
reloading context specific registers between contexts, providing isolation
of the basic GPU state (as programmable by userspace). This allows
userspace to assume that the GPU retains their state from one batch to the
next, minimising the amount of state it needs to reload and manually save
across batches.

v2: CONSTANT_BUFFER woes

Running through piglit turned up an interesting issue, a GPU hang inside
the context load. The context image includes the CONSTANT_BUFFER command
that loads an address into a on-gpu buffer, and the context load was
executing that immediately. However, since it was reading from the GTT
there is no guarantee that the GTT retains the same configuration as
when the context was saved, resulting in stray reads and a GPU hang.

Having tried issuing a CONSTANT_BUFFER (to disable the command) from the
ring before saving the context to no avail, we resort to patching out
the instruction inside the context image before loading.

This does impose that gen4 always reissues CONSTANT_BUFFER commands on
each batch, but due to the use of a shared GTT that was and will remain
a requirement.

v3: ECOSKPD to the rescue

Ville found the magic bit in the ECOSKPD to disable saving and restoring
the CONSTANT_BUFFER from the context image, thereby completely avoiding
the GPU hangs from chasing invalid pointers. This appears to be the
default behaviour for gen5, and so we just need to tweak gen4 to match.

v4: Fix spelling of ECOSKPD and discover it already exists

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419172720.5462-1-chris@chris-wilson.co.uk


# 1215d28e 18-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Enable render context support for Ironlake (gen5)

Ironlake does support being able to saving and reloading context specific
registers between contexts, providing isolation of the basic GPU state
(as programmable by userspace). This allows userspace to assume that the
GPU retains their state from one batch to the next, minimising the
amount of state it needs to reload, or manually save and restore.

v2: Fix off-by-one in reading CXT_SIZE, and add a comment that the
CXT_SIZE and context-layout do not match in bspec, but the difference is
irrelevant as we overallocate the full page anyway (Ville).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190419111749.3910-2-chris@chris-wilson.co.uk


# 79ffac85 24-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Invert the GEM wakeref hierarchy

In the current scheme, on submitting a request we take a single global
GEM wakeref, which trickles down to wake up all GT power domains. This
is undesirable as we would like to be able to localise our power
management to the available power domains and to remove the global GEM
operations from the heart of the driver. (The intent there is to push
global GEM decisions to the boundary as used by the GEM user interface.)

Now during request construction, each request is responsible via its
logical context to acquire a wakeref on each power domain it intends to
utilize. Currently, each request takes a wakeref on the engine(s) and
the engines themselves take a chipset wakeref. This gives us a
transition on each engine which we can extend if we want to insert more
powermangement control (such as soft rc6). The global GEM operations
that currently require a struct_mutex are reduced to listening to pm
events from the chipset GT wakeref. As we reduce the struct_mutex
requirement, these listeners should evaporate.

Perhaps the biggest immediate change is that this removes the
struct_mutex requirement around GT power management, allowing us greater
flexibility in request construction. Another important knock-on effect,
is that by tracking engine usage, we can insert a switch back to the
kernel context on that engine immediately, avoiding any extra delay or
inserting global synchronisation barriers. This makes tracking when an
engine and its associated contexts are idle much easier -- important for
when we forgo our assumed execution ordering and need idle barriers to
unpin used contexts. In the process, it means we remove a large chunk of
code whose only purpose was to switch back to the kernel context.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk


# 112ed2d3 24-Apr-2019 Chris Wilson <chris@chris-wilson.co.uk>

drm/i915: Move GraphicsTechnology files under gt/

Start partitioning off the code that talks to the hardware (GT) from the
uapi layers and move the device facing code under gt/

One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to
subdivide that header and body further (and split out the submission
code from the ringbuffer and logical context handling). This patch aims
to be simple motion so git can fixup inflight patches with little mess.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk