#
5e4e06e4 |
|
30-Oct-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: Track gt pm wakerefs Track every intel_gt_pm_get() until its corresponding release in intel_gt_pm_put() by returning a cookie to the caller for acquire that must be passed by on released. When there is an imbalance, we can see who either tried to free a stale wakeref, or who forgot to free theirs. v2: track recently added calls in gen8_ggtt_bind_get_ce and destroyed_worker_func Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231030-ref_tracker_i915-v1-2-006fe6b96421@intel.com
|
#
9c303439 |
|
28-Sep-2023 |
Mathias Krause <minipli@grsecurity.net> |
drm/i915: Clarify type evolution of uabi_node/uabi_engines Chaining user engines happens in multiple passes during driver initialization, mutating its type along the way. It starts off with a simple lock-less linked list (struct llist_node/head) populated by intel_engine_add_user() which later gets sorted and converted to an intermediate regular list (struct list_head) just to be converted once more to its final rb-tree structure (struct rb_node/root) in intel_engines_driver_register(). All of these types overlay the uabi_node/uabi_engines members which is unfortunate but safe if one takes care about using the rb-tree based structure only after the conversion has completed. However, mistakes happen and commit 1ec23ed7126e ("drm/i915: Use uabi engines for the default engine map") violated that assumption, as the multiple type evolution was all to easy hidden behind casts papering over it. Make the type evolution of uabi_node/uabi_engines more visible by putting all members into an anonymous union and use the correctly typed member in its various users. This allows us to drop quite some ugly casts and, hopefully, make the evolution of the members better recognisable to avoid future mistakes. Signed-off-by: Mathias Krause <minipli@grsecurity.net> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230928182019.10256-3-minipli@grsecurity.net
|
#
b3527499 |
|
26-Sep-2023 |
Nirmoy Das <nirmoy.das@intel.com> |
drm/i915: Create a kernel context for GGTT updates Create a separate kernel context if a platform requires GGTT updates using MI_UPDATE_GTT blitter command. Subsequent patch will introduce methods to update GGTT using this bind context and MI_UPDATE_GTT blitter command. v2: fix context leak on err(Oak) v3: improve err handling and improve function names(Andi) add docs for few functions. Signed-off-by: Nirmoy Das <nirmoy.das@intel.com> Reviewed-by: Oak Zeng <oak.zeng@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230926083742.14740-3-nirmoy.das@intel.com
|
#
28041067 |
|
21-Aug-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: mark requests for GuC virtual engines to avoid use-after-free References to i915_requests may be trapped by userspace inside a sync_file or dmabuf (dma-resv) and held indefinitely across different proceses. To counter-act the memory leaks, we try to not to keep references from the request past their completion. On the other side on fence release we need to know if rq->engine is valid and points to hw engine (true for non-virtual requests). To make it possible extra bit has been added to rq->execution_mask, for marking virtual engines. Fixes: bcb9aa45d5a0 ("Revert "drm/i915: Hold reference to intel_context over life of i915_request"") Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230821153035.3903006-1-andrzej.hajda@intel.com
|
#
5eefc530 |
|
21-Aug-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: mark requests for GuC virtual engines to avoid use-after-free References to i915_requests may be trapped by userspace inside a sync_file or dmabuf (dma-resv) and held indefinitely across different proceses. To counter-act the memory leaks, we try to not to keep references from the request past their completion. On the other side on fence release we need to know if rq->engine is valid and points to hw engine (true for non-virtual requests). To make it possible extra bit has been added to rq->execution_mask, for marking virtual engines. Fixes: bcb9aa45d5a0 ("Revert "drm/i915: Hold reference to intel_context over life of i915_request"") Signed-off-by: Chris Wilson <chris.p.wilson@linux.intel.com> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230821153035.3903006-1-andrzej.hajda@intel.com (cherry picked from commit 280410677af763f3871b93e794a199cfcf6fb580) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
1f5cf999 |
|
02-May-2023 |
Jani Nikula <jani.nikula@intel.com> |
drm/i915/engine: hide preempt_hang selftest member from kernel-doc drivers/gpu/drm/i915/gt/intel_engine_types.h:293: warning: Function parameter or member 'preempt_hang' not described in 'intel_engine_execlists' Signed-off-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/dafd771bb75cf14965dd3b666987c58a438de134.1683041799.git.jani.nikula@intel.com
|
#
5f284e9c |
|
23-Mar-2023 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/perf: Group engines into respective OA groups Now that we may have multiple OA units in a single GT as well as on separate GTs, create an engine group that maps to a single OA unit. v2: (Jani) - Drop warning on ENOMEM - Reorder patch in the series v3: (Ashutosh) - Remove unused members from perf structs - Update comments - Update engine_supports_oa check - Just return 1 in num_perf_groups_per_gt for now - Set engine->oa_group to NULL to begin with v4: Use engine_supports_oa() check in oa_init_reg_state (Ashutosh) v5: Rebase after dropping engine_supports_oa helper Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-5-umesh.nerlige.ramappa@intel.com
|
#
1008266e |
|
16-Feb-2023 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Consolidate TLB invalidation flow As the logic for selecting the register and corresponsing values grew, the code become a bit unsightly. Consolidate by storing the required values at engine init time in the engine itself, and by doing so minimise the amount of invariant platform and engine checks during each and every TLB invalidation. v2: * Fail engine probe if TLB invlidations registers are unknown. v3: * Rebase. v4: * Fix handling of GEN8_M2TCR. (Andrzej) v5: * Tidy checkpatch warnings. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andrzej Hajda <andrzej.hajda@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> # v1 Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230216092123.159085-1-tvrtko.ursulin@linux.intel.com
|
#
5fd974d1 |
|
02-Nov-2022 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915/mtl: add initial definitions for GSC CS Starting on MTL, the GSC is no longer managed with direct MMIO access, but we instead have a dedicated command streamer for it. As a first step for adding support for this CS, add the required definitions. Note that, although it is now a CS, the GSC retains its old class:instance value (OTHER_CLASS instance 6) Bspec: 65308, 45605 Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221102171047.2787951-2-daniele.ceraolospurio@intel.com
|
#
107ba1a2 |
|
21-Sep-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Restrict forced preemption to the active context When we submit a new pair of contexts to ELSP for execution, we start a timer by which point we expect the HW to have switched execution to the pending contexts. If the promotion to the new pair of contexts has not occurred, we declare the executing context to have hung and force the preemption to take place by resetting the engine and resubmitting the new contexts. This can lead to an unfair situation where almost all of the preemption timeout is consumed by the first context which just switches into the second context immediately prior to the timer firing and triggering the preemption reset (assuming that the timer interrupts before we process the CS events for the context switch). The second context hasn't yet had a chance to yield to the incoming ELSP (and send the ACk for the promotion) and so ends up being blamed for the reset. If we see that a context switch has occurred since setting the preemption timeout, but have not yet received the ACK for the ELSP promotion, rearm the preemption timer and check again. This is especially significant if the first context was not schedulable and so we used the shortest timer possible, greatly increasing the chance of accidentally blaming the second innocent context. Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Andrzej Hajda <andrzej.hajda@intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com
|
#
6ef7d362 |
|
21-Sep-2022 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Restrict forced preemption to the active context When we submit a new pair of contexts to ELSP for execution, we start a timer by which point we expect the HW to have switched execution to the pending contexts. If the promotion to the new pair of contexts has not occurred, we declare the executing context to have hung and force the preemption to take place by resetting the engine and resubmitting the new contexts. This can lead to an unfair situation where almost all of the preemption timeout is consumed by the first context which just switches into the second context immediately prior to the timer firing and triggering the preemption reset (assuming that the timer interrupts before we process the CS events for the context switch). The second context hasn't yet had a chance to yield to the incoming ELSP (and send the ACk for the promotion) and so ends up being blamed for the reset. If we see that a context switch has occurred since setting the preemption timeout, but have not yet received the ACK for the ELSP promotion, rearm the preemption timer and check again. This is especially significant if the first context was not schedulable and so we used the shortest timer possible, greatly increasing the chance of accidentally blaming the second innocent context. Fixes: 3a7a92aba8fb ("drm/i915/execlists: Force preemption") Fixes: d12acee84ffb ("drm/i915/execlists: Cancel banned contexts on schedule-out") Reported-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Andi Shyti <andi.shyti@linux.intel.com> Reviewed-by: Andrzej Hajda <andrzej.hajda@intel.com> Tested-by: Andrzej Hajda <andrzej.hajda@intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220921135258.1714873-1-andrzej.hajda@intel.com (cherry picked from commit 107ba1a2c705f4358f2602ec2f2fd821bb651f42) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
9a92732f |
|
01-Jul-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Add general DSS steering iterator to intel_gt_mcr Although all DSS belong to a single pool on Xe_HP platforms (i.e., they're not organized into slices from a topology point of view), we do still need to pass 'group' and 'instance' targets when steering register accesses to a specific instance of a per-DSS multicast register. The rules for how to determine group and instance IDs (which previously used legacy terms "slice" and "subslice") varies by platform. Some platforms determine steering by gslice membership, some platforms by cslice membership, and future platforms may have other rules. Since looping over each DSS and performing steered unicast register accesses is a relatively common pattern, let's add a dedicated iteration macro to handle this (and replace the platform-specific "instdone" loop we were using previously. This will avoid the calling code needing to figure out the details about how to obtain steering IDs for a specific DSS. Most of the places where we use this new loop are in the GPU errorstate code at the moment, but we do have some additional features coming in the future that will also need to loop over each DSS and steer some register accesses accordingly. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220701232006.1016135-1-matthew.d.roper@intel.com
|
#
69f8afdb |
|
05-May-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/pvc: Engine definitions for new copy engines This patch adds the basic definitions needed to support new copy engines. Also updating the cmd_info to accommodate new engines, as the engine id's of legacy engines have been changed. v2: - Add _BCS(n) definition, similar to other engines. (Tvrtko) - Add I915_MAX_BCS definition, similar to other engnes. (Prathap) - Move GVT change to avoid u16 overflow to its own patch. (Tvrtko) Original-author: CQ Tang Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Prathap Kumar Valsan <prathap.kumar.valsan@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: José Roberto de Souza <jose.souza@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220505213812.3979301-9-matthew.d.roper@intel.com
|
#
717f9bad |
|
15-Apr-2022 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/dg2: Enable Wa_14014475959 - RCS / CCS context exit There is bug in DG2 where if the CCS contexts switches out while the RCS is running it can cause memory corruption. To workaround this add an atomic to a memory address with a value 1 and semaphore wait to the same address for a value of 0. The GuC firmware is responsible for writing 0 to the memory address when it is safe for the context to switch out. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220415224025.3693037-6-umesh.nerlige.ramappa@intel.com
|
#
144ce0ac |
|
11-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/sseu: Don't overallocate subslice storage Xe_HP removed "slice" as a first-class unit in the hardware design. Instead we now have a single pool of subslices (which are now referred to as "DSS") that different hardware units have different ways of grouping ("compute slices," "geometry slices," etc.). For the purposes of topology representation, we treat Xe_HP-based platforms as having a single slice that contains all of the platform's DSS. There's no need to allocate storage space for (max legacy slices * max dss); let's update some of our macros to minimize the storage requirement for sseu topology. We'll also document some of the constants to make it a little bit more clear what they represent. v2: - s/LEGACY/HSW/ in macro names. (Lucas) - Rename MAX() to SSEU_MAX() to avoid any potential clashes with other definitions elsewhere. Unfortunately max()/max_t() from linux/minmax.h cannot be used in this context. (Lucas) Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220311225459.385515-1-matthew.d.roper@intel.com
|
#
239bbb2f |
|
10-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Remove GEN12_SFC_DONE_MAX from register defs header We shouldn't really be keeping track of how many SFC_DONE registers our platforms can have, but rather how many SFC hardware units there can be (each SFC unit will have one corresponding SFC_DONE register). So drop the stray GEN12_SFC_DONE_MAX definition we had in the register definition file and replace it with an I915_MAX_SFC that follows the pattern we use for other hardware units. Note that our hardware has a 2:1:1 ratio of VD:VE:SFC, and as far as we know that pattern should carry forward to future platforms, so we'll define it as #VCS/2. Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220311062835.163744-1-matthew.d.roper@intel.com
|
#
f9576e36 |
|
03-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Support platforms with CCS engines but no RCS In the past we've always assumed that an RCS engine is present on every platform. However now that we have compute engines there may be platforms that have CCS engines but no RCS, or platforms that are designed to have both, but have the RCS engine fused off. Various engine-centric initialization that only needs to be done a single time for the group of RCS+CCS engines can't rely on being setup with the RCS now; instead we add a I915_ENGINE_FIRST_RENDER_COMPUTE flag that will be assigned to a single engine in the group; whichever engine has this flag will be responsible for some of the general setup (RCU_MODE programming, initialization of certain workarounds, etc.). Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220303223435.2793124-1-matthew.d.roper@intel.com
|
#
adfadb56 |
|
01-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Define context scheduling attributes in lrc descriptor In Dual Context mode the EUs are shared between render and compute command streamers. The hardware provides a field in the lrc descriptor to indicate the prioritization of the thread dispatch associated to the corresponding context. The context priority is set to 'low' at creation time and relies on the existing context priority to set it to low/normal/high. Bspec: 46145, 46260 Original-author: Michel Thierry Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Prasad Nallani <prasad.nallani@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-8-matthew.d.roper@intel.com
|
#
c674c5b9 |
|
01-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: CCS should use RCS setup functions The compute engine handles the same commands the render engine can (except 3D pipeline), so it makes sense that CCS is more similar to RCS than non-render engines. The CCS context state (lrc) is also similar to the render one, so reuse it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE register. In order to avoid having multiple RCS && CCS checks, add the following engine flag: - I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx. BSpec: 46260 Original-author: Michel Thierry Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com
|
#
944823c9 |
|
01-Mar-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Define compute class and engine Introduce a Compute Command Streamer (CCS), which has access to the media and GPGPU pipelines (but not the 3D pipeline). To begin with, define the compute class/engine common functions, based on the existing render ones. v2: - Add kerneldoc for drm_i915_gem_engine_class since we're adding a new element to it. (Daniel) - Make engine class <-> guc class converters use lookup tables to make it more clear/explicit how the IDs map. (Tvrtko) v3: - Don't update uapi for now; we'll just include the driver-internal changes for the time being. Bspec: 46167, 45544 Original-author: Michel Thierry Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vinay Belgaumkar <vinay.belgaumkar@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> #v1 Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-2-matthew.d.roper@intel.com
|
#
20cddfcc |
|
06-Dec-2021 |
Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> |
drm/i915/gt: Use hw_engine_masks as reset_domains We need a way to reset engines by their reset domains. This change sets up way to fetch reset domains of each engine globally. Changes since V1: - Use static reset domain array - Ville and Tvrtko - Use BUG_ON at appropriate place - Tvrtko Signed-off-by: Tejas Upadhyay <tejaskumarx.surendrakumar.upadhyay@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211206081026.4024401-1-tejaskumarx.surendrakumar.upadhyay@intel.com
|
#
77cdd054 |
|
26-Oct-2021 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/pmu: Connect engine busyness stats from GuC to pmu With GuC handling scheduling, i915 is not aware of the time that a context is scheduled in and out of the engine. Since i915 pmu relies on this info to provide engine busyness to the user, GuC shares this info with i915 for all engines using shared memory. For each engine, this info contains: - total busyness: total time that the context was running (total) - id: id of the running context (id) - start timestamp: timestamp when the context started running (start) At the time (now) of sampling the engine busyness, if the id is valid (!= ~0), and start is non-zero, then the context is considered to be active and the engine busyness is calculated using the below equation engine busyness = total + (now - start) All times are obtained from the gt clock base. For inactive contexts, engine busyness is just equal to the total. The start and total values provided by GuC are 32 bits and wrap around in a few minutes. Since perf pmu provides busyness as 64 bit monotonically increasing values, there is a need for this implementation to account for overflows and extend the time to 64 bits before returning busyness to the user. In order to do that, a worker runs periodically at frequency = 1/8th the time it takes for the timestamp to wrap. As an example, that would be once in 27 seconds for a gt clock frequency of 19.2 MHz. Note: There might be an over-accounting of busyness due to the fact that GuC may be updating the total and start values while kmd is reading them. (i.e kmd may read the updated total and the stale start). In such a case, user may see higher busyness value followed by smaller ones which would eventually catch up to the higher value. v2: (Tvrtko) - Include details in commit message - Move intel engine busyness function into execlist code - Use union inside engine->stats - Use natural type for ping delay jiffies - Drop active_work condition checks - Use for_each_engine if iterating all engines - Drop seq locking, use spinlock at GuC level to update engine stats - Document worker specific details v3: (Tvrtko/Umesh) - Demarcate GuC and execlist stat objects with comments - Document known over-accounting issue in commit - Provide a consistent view of GuC state - Add hooks to gt park/unpark for GuC busyness - Stop/start worker in gt park/unpark path - Drop inline - Move spinlock and worker inits to GuC initialization - Drop helpers that are called only once v4: (Tvrtko/Matt/Umesh) - Drop addressed opens from commit message - Get runtime pm in ping, remove from the park path - Use cancel_delayed_work_sync in disable_submission path - Update stats during reset prepare - Skip ping if reset in progress - Explicitly name execlists and GuC stats objects - Since disable_submission is called from many places, move resetting stats to intel_guc_submission_reset_prepare v5: (Tvrtko) - Add a trylock helper that does not sleep and synchronize PMU event callbacks and worker with gt reset v6: (CI BAT failures) - DUTs using execlist submission failed to boot since __gt_unpark is called during i915 load. This ends up calling the GuC busyness unpark hook and results in kick-starting an uninitialized worker. Let park/unpark hooks check if GuC submission has been initialized. - drop cant_sleep() from trylock helper since rcu_read_lock takes care of that. v7: (CI) Fix igt@i915_selftest@live@gt_engines - For GuC mode of submission the engine busyness is derived from gt time domain. Use gt time elapsed as reference in the selftest. - Increase busyness calculation to 10ms duration to ensure batch runs longer and falls within the busyness tolerances in selftest. v8: - Use ktime_get in selftest as before - intel_reset_trylock_no_wait results in a lockdep splat that is not trivial to fix since the PMU callback runs in irq context and the reset paths are tightly knit into the driver. The test that uncovers this is igt@perf_pmu@faulting-read. Drop intel_reset_trylock_no_wait, instead use the reset_count to synchronize with gt reset during pmu callback. For the ping, continue to use intel_reset_trylock since ping is not run in irq context. - GuC PM timestamp does not tick when GuC is idle. This can potentially result in wrong busyness values when a context is active on the engine, but GuC is idle. Use the RING TIMESTAMP as GPU timestamp to process the GuC busyness stats. This works since both GuC timestamp and RING timestamp are synced with the same clock. - The busyness stats may get updated after the batch starts running. This delay causes the busyness reported for 100us duration to fall below 95% in the selftest. The only option at this time is to wait for GuC busyness to change from idle to active before we sample busyness over a 100us period. Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-2-umesh.nerlige.ramappa@intel.com
|
#
344e6947 |
|
26-Oct-2021 |
Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> |
drm/i915/pmu: Add a name to the execlists stats In preparation for GuC pmu stats, add a name to the execlists stats structure so that it can be differentiated from the GuC stats. Signed-off-by: Umesh Nerlige Ramappa <umesh.nerlige.ramappa@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211027004821.66097-1-umesh.nerlige.ramappa@intel.com
|
#
4f3059dc |
|
14-Oct-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Add logical engine mapping Add logical engine mapping. This is required for split-frame, as workloads need to be placed on engines in a logically contiguous manner. v2: (Daniel Vetter) - Add kernel doc for new fields v3: (Tvrtko) - Update comment for new logical_mask field v4: (John Harrison) - Update comment for new logical_mask field Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211014172005.27155-6-matthew.brost@intel.com
|
#
1a839e01 |
|
05-Oct-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915: remove IS_ACTIVE When trying to bring IS_ACTIVE to linux/kconfig.h I thought it wouldn't provide much value just encapsulating it in a boolean context. So I also added the support for handling undefined macros as the IS_ENABLED() counterpart. However the feedback received from Masahiro Yamada was that it is too ugly, not providing much value. And just wrapping in a boolean context is too dumb - we could simply open code it. As detailed in commit babaab2f4738 ("drm/i915: Encapsulate kconfig constant values inside boolean predicates"), the IS_ACTIVE macro was added to workaround a compilation warning. However after checking again our current uses of IS_ACTIVE it turned out there is only 1 case in which it triggers a warning in clang (due -Wconstant-logical-operand) and 2 in smatch. All the others can simply use the shorter version, without wrapping it in any macro. So here I'm dialing all the way back to simply removing the macro. That single case hit by clang can be changed to make the constant come first, so it doesn't think it's mask: - if (context && CONFIG_DRM_I915_FENCE_TIMEOUT) + if (CONFIG_DRM_I915_FENCE_TIMEOUT && context) As talked with Dan Carpenter, that logic will be added in smatch as well, so it will also stop warning about it. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Reviewed-by: Masahiro Yamada <masahiroy@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211005171728.3147094-1-lucas.demarchi@intel.com
|
#
3e42cc61 |
|
22-Sep-2021 |
Thomas Hellström <thomas.hellstrom@linux.intel.com> |
drm/i915/gt: Register the migrate contexts with their engines Pinned contexts, like the migrate contexts need reset after resume since their context image may have been lost. Also the GuC needs to register pinned contexts. Add a list to struct intel_engine_cs where we add all pinned contexts on creation, and traverse that list at resume time to reset the pinned contexts. This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now, but proper LMEM backup / restore is needed for full suspend functionality. However, note that even with full LMEM backup / restore it may be desirable to keep the reset since backing up the migrate context images must happen using memcpy() after the migrate context has become inactive, and for performance- and other reasons we want to avoid memcpy() from LMEM. Also traverse the list at guc_init_lrc_mapping() calling guc_kernel_context_pin() for the pinned contexts, like is already done for the kernel context. v2: - Don't reset the contexts on each __engine_unpark() but rather at resume time (Chris Wilson). v3: - Reset contexts in the engine sanitize callback. (Chris Wilson) Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Brost Matthew <matthew.brost@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas.hellstrom@linux.intel.com
|
#
89f2e7ab |
|
05-Aug-2021 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/dg2: Report INSTDONE_GEOM values in error state Xe_HPG adds some additional INSTDONE_GEOM debug registers; the Mesa team has indicated that having these reported in the error state would be useful for debugging GPU hangs. These registers are replicated per-DSS with gslice steering. Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matt Atwood <matthew.s.atwood@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210805163647.801064-4-matthew.d.roper@intel.com
|
#
fa9899da |
|
05-Aug-2021 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/xehp: Loop over all gslices for INSTDONE processing We no longer have traditional slices on Xe_HP platforms, but the INSTDONE registers are replicated according to gslice representation which is similar. We can mostly re-use the existing instdone code with just a few modifications: * Create an alternate instdone loop macro that will iterate over the flat DSS space, but still provide the gslice/dss steering values for compatibility with the legacy code. * We should allocate INSTDONE storage space according to the maximum number of gslices rather than the maximum number of legacy slices to ensure we have enough storage space to hold all of the values. XeHP design has 8 gslices, whereas older platforms never had more than 3 slices. Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210805163647.801064-3-matthew.d.roper@intel.com
|
#
3a4bfa09 |
|
26-Jul-2021 |
Rahul Kumar Singh <rahul.kumar.singh@intel.com> |
drm/i915/selftest: Fix workarounds selftest for GuC submission When GuC submission is enabled, the GuC controls engine resets. Rather than explicitly triggering a reset, the driver must submit a hanging context to GuC and wait for the reset to occur. Signed-off-by: Rahul Kumar Singh <rahul.kumar.singh@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-28-matthew.brost@intel.com
|
#
573ba126 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Capture error state on context reset We receive notification of an engine reset from GuC at its completion. Meaning GuC has potentially cleared any HW state we may have been interested in capturing. GuC resumes scheduling on the engine post-reset, as the resets are meant to be transparent, further muddling our error state. There is ongoing work to define an API for a GuC debug state dump. The suggestion for now is to manually disable FW initiated resets in cases where debug state is needed. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-19-matthew.brost@intel.com
|
#
d1cee2d3 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move active request tracking to a vfunc Move active request tracking to a backend vfunc rather than assuming all backends want to do this in the manner. In the of case execlists / ring submission the tracking is on the physical engine while with GuC submission it is on the context. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-8-matthew.brost@intel.com
|
#
a95d1160 |
|
26-Jul-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915/guc: Direct all breadcrumbs for a class to single breadcrumbs With GuC virtual engines the physical engine which a request executes and completes on isn't known to the i915. Therefore we can't attach a request to a physical engines breadcrumbs. To work around this we create a single breadcrumbs per engine class when using GuC submission and direct all physical engine interrupts to this breadcrumbs. v2: (John H) - Rework header file structure so intel_engine_mask_t can be in intel_engine_types.h Signed-off-by: Matthew Brost <matthew.brost@intel.com> CC: John Harrison <John.C.Harrison@Intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-6-matthew.brost@intel.com
|
#
96d3e0e1 |
|
26-Jul-2021 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/guc: Make hangcheck work with GuC virtual engines The serial number tracking of engines happens at the backend of request submission and was expecting to only be given physical engines. However, in GuC submission mode, the decomposition of virtual to physical engines does not happen in i915. Instead, requests are submitted to their virtual engine mask all the way through to the hardware (i.e. to GuC). This would mean that the heart beat code thinks the physical engines are idle due to the serial number not incrementing. Which in turns means hangcheck does not work for GuC virtual engines. This patch updates the tracking to decompose virtual engines into their physical constituents and tracks the request against each. This is not entirely accurate as the GuC will only be issuing the request to one physical engine. However, it is the best that i915 can do given that it has no knowledge of the GuC's scheduling decisions. Downside of this is that all physical engines constituting a GuC virtual engine will be periodically unparked (even during just a single context executing) in order to be pinged with a heartbeat request. However the power and performance cost of this is not expected to be measurable (due low frequency of heartbeat pulses) and it is considered an easier option than trying to make changes to GuC firmware. v2: (Tvrtko) - Update commit message - Have default behavior if no vfunc present Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210727002348.97202-3-matthew.brost@intel.com
|
#
816753c0 |
|
22-Jul-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: nuke gen6_hw_id This is only used by GRAPHICS_VER == 6 and GRAPHICS_VER == 7. All other recent platforms do not depend on this field, so it doesn't make much sense to keep it generic like that. Instead, just do a mapping from engine class to HW ID in the single place that is needed. v2: use macros with the direct register address instead of calculating from the legacy HW_ID (Matt Roper) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210723002551.3906535-1-lucas.demarchi@intel.com
|
#
938c778f |
|
23-Jul-2021 |
John Harrison <John.C.Harrison@Intel.com> |
drm/i915/xehp: Extra media engines - Part 1 (engine definitions) Xe_HP can have a lot of extra media engines. This patch adds the basic definitions for them. v2: - Re-order intel_gt_info and intel_device_info slightly to avoid unnecessary padding now that we've increased the size of intel_engine_mask_t. (Tvrtko) v3: - Drop the .hw_id assignments. (Lucas) v4: - Fix graphics_ver typo for VCS4 (should be 12, not 11). (Lucas) Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Signed-off-by: Tomas Winkler <tomas.winkler@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210723191024.1553405-1-matthew.d.roper@intel.com
|
#
265b5ee0 |
|
20-Jul-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: rename legacy engine->hw_id to engine->gen6_hw_id We kept adding new engines and for that increasing hw_id unnecessarily: it's not used since GRAPHICS_VER == 8. Prepend "gen6" to the field and try to pack it in the structs to give a hint this field is actually not used in recent platforms. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210720232014.3302645-4-lucas.demarchi@intel.com
|
#
f9be3000 |
|
20-Jul-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: nuke unused legacy engine hw_id The engine hw_id is only used by RING_FAULT_REG(), which is not used by GRAPHICS_VER >= 8. We did use hw_id on recent platforms to set the engine's guc_id, but that is not the case anymore since commit c784e5249e77 ("drm/i915/guc: Update to use firmware v49.0.1"): now we only use class and id information to generate guc_id. We tend to keep adding new defines just to be consistent, but let's try to remove them and let them defined to 0 for engines that only exist on gen8+ platforms. v2: Reword commit message and add information about when we stopped using hw_id (Matt Roper) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210720232014.3302645-3-lucas.demarchi@intel.com
|
#
dd4f1bba |
|
08-Jul-2021 |
Jason Ekstrand <jason@jlekstrand.net> |
drm/i915/gem: Remove engine auto-magic with FENCE_SUBMIT (v2) Even though FENCE_SUBMIT is only documented to wait until the request in the in-fence starts instead of waiting until it completes, it has a bit more magic than that. If FENCE_SUBMIT is used to submit something to a balanced engine, we would wait to assign engines until the primary request was ready to start and then attempt to assign it to a different engine than the primary. There is an IGT test (the bonded-slice subtest of gem_exec_balancer) which exercises this by submitting a primary batch to a specific VCS and then using FENCE_SUBMIT to submit a secondary which can run on any VCS and have i915 figure out which VCS to run it on such that they can run in parallel. However, this functionality has never been used in the real world. The media driver (the only user of FENCE_SUBMIT) always picks exactly two physical engines to bond and never asks us to pick which to use. v2 (Daniel Vetter): - Mention the exact IGT test this breaks Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210708154835.528166-11-jason@jlekstrand.net
|
#
22916bad |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move submission tasklet to i915_sched_engine The submission tasklet operates on i915_sched_engine, thus it is the correct place for it. v3: (Jason Ekstrand) Change sched_engine->engine to a void* private data pointer Add kernel doc v4: (Daniele) Update private_data comment Set queue_priority_hint in kick_execlists v5: (CI) Rebase and fix build error Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-9-matthew.brost@intel.com
|
#
3f623e06 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move engine->schedule to i915_sched_engine The schedule function should be in the schedule object. v3: (Jason Ekstrand) Add kernel doc Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-6-matthew.brost@intel.com
|
#
349a2bc5 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move active tracking to i915_sched_engine Move active request tracking and its lock to i915_sched_engine. This lock is also the submission lock so having it in the i915_sched_engine is the correct place. v3: (Jason Ekstrand) Add kernel doc v6: Rebase Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.comk> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-5-matthew.brost@intel.com
|
#
3e28d371 |
|
17-Jun-2021 |
Matthew Brost <matthew.brost@intel.com> |
drm/i915: Move priolist to new i915_sched_engine object Introduce i915_sched_engine object which is lower level data structure that i915_scheduler / generic code can operate on without touching execlist specific structures. This allows additional submission backends to be added without breaking the layering. Currently the execlists backend uses 1 of these object per each engine (physical or virtual) but future backends like the GuC will point to less instances utilizing the reference counting. This is a bit of detour to integrating the i915 with the DRM scheduler but this object will still exist when the DRM scheduler lands in the i915. It will however look a bit different. It will encapsulate the drm_gpu_scheduler object plus and common variables (to the backends) related to scheduling. Regardless this is a step in the right direction. This patch starts the aforementioned transition by moving the priolist into the i915_sched_engine object. v3: (Jason Ekstrand) Update comment next to intel_engine_cs.virtual Add kernel doc (Checkpatch) Fix double the in commit message v4: (Daniele) Update comment message. Add comment about subclass field Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210618010638.98941-2-matthew.brost@intel.com
|
#
fa20cbdd |
|
05-Jun-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: Add remaining conversions to GRAPHICS_VER For some reason coccinelle misses a few cases in gt with calls to INTEL_GEN()/IS_GEN(). Do a manual conversion for those. Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-3-lucas.demarchi@intel.com
|
#
0669a6e1 |
|
21-May-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move CS interrupt handler to the backend The different submission backends each have their own preferred behaviour and interrupt setup. Let each handle their own interrupts. This becomes more useful later as we to extract the use of auxiliary state in the interrupt handler that is backend specific. Signed-off-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210521183215.65451-4-matthew.brost@intel.com
|
#
24f90d66 |
|
22-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: SPDX cleanup Clean up the SPDX licence declarations to comply with checkpatch. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210122192913.4518-1-chris@chris-wilson.co.uk Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
|
#
f530a41d |
|
15-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Convert stats.active to plain unsigned int As context-in/out is now always serialised, we do not have to worry about concurrent enabling/disable of the busy-stats and can reduce the atomic_t active to a plain unsigned int, and the seqlock to a seqcount. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-3-chris@chris-wilson.co.uk
|
#
2c421896 |
|
15-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Drop atomic for engine->fw_active tracking Since schedule-in/out is now entirely serialised by the tasklet bitlock, we do not need to worry about concurrent in/out operations and so reduce the atomic operations to plain instructions. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210115142331.24458-1-chris@chris-wilson.co.uk
|
#
4386b8e5 |
|
07-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Remove timeslice suppression In the next^W future patch, we remove the strict priority system and continuously re-evaluate the relative priority of tasks. As such we need to enable the timeslice whenever there is more than one context in the pipeline. This simplifies the decision and removes some of the tweaks to suppress timeslicing, allowing us to lift the timeslice enabling to a common spot at the end of running the submission tasklet. One consequence of the suppression is that it was reducing fairness between virtual engines on an over saturated system; undermining the principle for timeslicing. v2: Commentary v3: Commentary for the right cancel_timer() v4: Add tracing for why we need a timeslice Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2802 Testcase: igt/gem_exec_balancer/fairslice Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210107132322.28373-1-chris@chris-wilson.co.uk
|
#
0a7d355e |
|
04-Jan-2021 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Allow failed resets without assertion If the engine reset fails, we will attempt to resume with the current inflight submissions. When that happens, we cannot assert that the engine reset cleared the pending submission, so do not. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2878 Fixes: 16f2941ad307 ("drm/i915/gt: Replace direct submit with direct call to tasklet") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210104115145.24460-3-chris@chris-wilson.co.uk
|
#
16f2941a |
|
24-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Replace direct submit with direct call to tasklet Rather than having special case code for opportunistically calling process_csb() and performing a direct submit while holding the engine spinlock for submitting the request, simply call the tasklet directly. This allows us to retain the direct submission path, including the CS draining to allow fast/immediate submissions, without requiring any duplicated code paths, and most importantly greatly simplifying the control flow by removing reentrancy. This will enable us to close a few races in the virtual engines in the next few patches. The trickiest part here is to ensure that paired operations (such as schedule_in/schedule_out) remain under consistent locking domains, e.g. when pulled outside of the engine->active.lock v2: Use bh kicking, see commit 3c53776e29f8 ("Mark HI and TASKLET softirq synchronous"). v3: Update engine-reset to be tasklet aware Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201224135544.1713-1-chris@chris-wilson.co.uk
|
#
b436a5f8 |
|
22-Dec-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Track all timelines created using the HWSP We assume that the contents of the HWSP are lost across suspend, and so upon resume we must restore critical values such as the timeline seqno. Keep track of every timeline allocated that uses the HWSP as its storage and so we can then reset all seqno values by walking that list. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201222104242.10993-1-chris@chris-wilson.co.uk
|
#
ca05277e |
|
15-Sep-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Widen CSB pointer to u64 for the parsers A CSB entry is 64b, and it is simpler for us to treat it as an array of 64b entries than as an array of pairs of 32b entries. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk (cherry picked from commit f24a44e52fbc9881fc5f3bcef536831a15a439f3) (cherry picked from commit 3d4dbe0e0f0d04ebcea917b7279586817da8cf46) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
f24a44e5 |
|
15-Sep-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Widen CSB pointer to u64 for the parsers A CSB entry is 64b, and it is simpler for us to treat it as an array of 64b entries than as an array of pairs of 32b entries. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-1-chris@chris-wilson.co.uk
|
#
b3786b29 |
|
31-Jul-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Distinguish the virtual breadcrumbs from the irq breadcrumbs On the virtual engines, we only use the intel_breadcrumbs for tracking signaling of stale breadcrumbs from the irq_workers. They do not have any associated interrupt handling, active requests are passed to a physical engine and associated breadcrumb interrupt handler. This causes issues for us as we need to ensure that we do not actually try and enable interrupts and the powermanagement required for them on the virtual engine, as they will never be disabled. Instead, let's specify the physical engine used for interrupt handler on a particular breadcrumb. v2: Drop b->irq_armed = true mocking for no interrupt HW Fixes: 4fe6abb8f513 ("drm/i915/gt: Ignore irq enabling on the virtual engines") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200731154834.8378-4-chris@chris-wilson.co.uk Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
b2295e2e |
|
10-Jul-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Be defensive in the face of false CS events If the HW throws a curve ball and reports either en event before it is possible, or just a completely impossible event, we have to grin and bear it. The first few events, we will likely not notice as we would be expecting some event, but as soon as we stop expecting an event and yet they still keep coming, then we enter into undefined state territory. In which case, bail out, stop processing the events, and reset the engine and our set of queued requests to recover. The sporadic hangs and warnings will continue to plague CI, but at least system stability should not be compromised. v2: Commentary and force the reset-on-error. v3: Customised user facing message for forced resets from internal errors. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200710133125.30194-1-chris@chris-wilson.co.uk
|
#
aab4707f |
|
02-Jul-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Harden the heartbeat against a stuck driver If the driver gets stuck holding the kernel timeline, we cannot issue a heartbeat and so fail to discover that the driver is indeed stuck and do not issue a GPU reset (which would hopefully unstick the driver!). Switch to using a trylock so that we can query if the heartbeat's timeline mutex is locked elsewhere, and then use the timer to probe if it remains stuck at the same spot for consecutive heartbeats, indicating that the mutex has not been released and the engine has not progressed. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200702095219.963-1-chris@chris-wilson.co.uk
|
#
ac533c56 |
|
04-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Track if an engine requires forcewake w/a Sometimes an engine might need to keep forcewake active while it is busy submitting requests for a particular workaround. Track such nuisance with engine->fw_domain. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Venkata Sandeep Dhanalakota <venkata.s.dhanalakota@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200604153145.21068-1-chris@chris-wilson.co.uk
|
#
0f4013fb2 |
|
13-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Transfer old virtual breadcrumbs to irq_worker The second try at staging the transfer of the breadcrumb. In part one, we realised we could not simply move to the second engine as we were only holding the breadcrumb lock on the first. So in commit 6c81e21a4742 ("drm/i915/gt: Stage the transfer of the virtual breadcrumb"), we removed it from the first engine and marked up this request to reattach the signaling on the new engine. However, this failed to take into account that we only attach the breadcrumb if the new request is added at the start of the queue, which if we are transferring, it is because we know there to be a request to be signaled (and hence we would not be attached). In this attempt, we try to transfer the completed requests to the irq_worker on its rq->engine->breadcrumbs. This preserves the coupling between the rq and its breadcrumbs, so that i915_request_cancel_breadcrumb() does not attempt to manipulate the list under the wrong lock. v2: Code sharing is fun. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1862 Fixes: 6c81e21a4742 ("drm/i915/gt: Stage the transfer of the virtual breadcrumb") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200513074809.18194-1-chris@chris-wilson.co.uk
|
#
7a0ba6b4 |
|
14-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Show per-engine default property values in sysfs By providing the default values configured into the kernel via sysfs, it is much more convenient for userspace to restore those sane defaults, or at least know what are considered good baseline. This is useful, for example, to cleanup after any failed userspace prior to commencing new jobs. /sys/class/drm/card0/engine/rcs0/ ├── capabilities ├── class ├── .defaults │ ├── heartbeat_interval_ms │ ├── max_busywait_duration_ns │ ├── preempt_timeout_ms │ ├── stop_timeout_ms │ └── timeslice_duration_ms ├── heartbeat_interval_ms ├── instance ├── known_capabilities ├── max_busywait_duration_ns ├── mmio_base ├── name ├── preempt_timeout_ms ├── stop_timeout_ms └── timeslice_duration_ms Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Maciej Patelczyk <maciej.patelczyk@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200514062905.28668-1-chris@chris-wilson.co.uk
|
#
1bc6a601 |
|
28-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Track inflight CCID The presumption is that by using a circular counter that is twice as large as the maximum ELSP submission, we would never reuse the same CCID for two inflight contexts. However, if we continually preempt an active context such that it always remains inflight, it can be resubmitted with an arbitrary number of paired contexts. As each of its paired contexts will use a new CCID, eventually it will wrap and submit two ELSP with the same CCID. Rather than use a simple circular counter, switch over to a small bitmap of inflight ids so we can avoid reusing one that is still potentially active. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk (cherry picked from commit 5c4a53e3b1cbc38d0906e382f1037290658759bb) (cherry picked from commit 134711240307d5586ae8e828d2699db70a8b74f2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
53b2622e |
|
28-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Avoid reusing the same logical CCID The bspec is confusing on the nature of the upper 32bits of the LRC descriptor. Once upon a time, it said that it uses the upper 32b to decide if it should perform a lite-restore, and so we must ensure that each unique context submitted to HW is given a unique CCID [for the duration of it being on the HW]. Currently, this is achieved by using a small circular tag, and assigning every context submitted to HW a new id. However, this tag is being cleared on repinning an inflight context such that we end up re-using the 0 tag for multiple contexts. To avoid accidentally clearing the CCID in the upper 32bits of the LRC descriptor, split the descriptor into two dwords so we can update the GGTT address separately from the CCID. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk (cherry picked from commit 2632f174a2e1a5fd40a70404fa8ccfd0b1f79ebd) (cherry picked from commit a4b70fcc587860f4b972f68217d8ebebe295ec15) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
220dcfc1 |
|
07-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the user batch or in our own preamble, the engine raises a GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so respond to a semaphore wait by yielding the timeslice, if we have another context to yield to! The only real complication is that the interrupt is only generated for the start of the semaphore wait, and is asynchronous to our process_csb() -- that is, we may not have registered the timeslice before we see the interrupt. To ensure we don't miss a potential semaphore blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark the interrupt and apply it to the next timeslice regardless of whether it was active at the time. v2: We use semaphores in preempt-to-busy, within the timeslicing implementation itself! Ergo, when we do insert a preemption due to an expired timeslice, the new context may start with the missed semaphore flagged by the retired context and be yielded, ad infinitum. To avoid this, read the context id at the time of the semaphore interrupt and only yield if that context is still active. Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk (cherry picked from commit c4e8ba7390346a77ffe33ec3f210bc62e0b6c8c6) (cherry picked from commit cd60e4ac4738a6921592c4f7baf87f9a3499f0e2) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
977253df |
|
04-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Stop holding onto the pinned_default_state As we only restore the default context state upon banning a context, we only need enough of the state to run the ring and nothing more. That is we only need our bare protocontext. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200504180745.15645-1-chris@chris-wilson.co.uk
|
#
b68be5c6 |
|
05-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Record the active CCID from before reset If we cannot trust the reset will flush out the CS event queue such that process_csb() reports an accurate view of HW, we will need to search the active and pending contexts to determine which was actually running at the time we issued the reset. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200505084629.31365-1-chris@chris-wilson.co.uk
|
#
fe5a7082 |
|
01-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Make timeslicing an explicit engine property In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk (cherry picked from commit a211da9c771bf97395a3ced83a3aa383372b13a7) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
#
a211da9c |
|
01-May-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Make timeslicing an explicit engine property In order to allow userspace to rely on timeslicing to reorder their batches, we must support preemption of those user batches. Declare timeslicing as an explicit property that is a combination of having the kernel support and HW support. Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200501122249.12417-1-chris@chris-wilson.co.uk
|
#
16e87459 |
|
29-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move the batch buffer pool from the engine to the gt Since the introduction of 'soft-rc6', we aim to park the device quickly and that results in frequent idling of the whole device. Currently upon idling we free the batch buffer pool, and so this renders the cache ineffective for many workloads. If we want to have an effective cache of recently allocated buffers available for reuse, we need to decouple that cache from the engine powermanagement and make it timer based. As there is no reason then to keep it within the engine (where it once made retirement order easier to track), we can move it up the hierarchy to the owner of the memory allocations. v2: Hook up to debugfs/drop_caches to clear the cache on demand. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200430111819.10262-2-chris@chris-wilson.co.uk
|
#
36d516be |
|
29-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Switch to manual evaluation of RPS As with the realisation for soft-rc6, we respond to idling the engines within microseconds, far faster than the response times for HW RC6 and RPS. Furthermore, our fast parking upon idle, prevents HW RPS from running for many desktop workloads, as the RPS evaluation intervals are on the order of tens of milliseconds, but the typical workload is just a couple of milliseconds, but yet we still need to determine the best frequency for user latency versus power. Recognising that the HW evaluation intervals are a poor fit, and that they were deprecated [in bspec at least] from gen10, start to wean ourselves off them and replace the EI with a timer and our accurate busy-stats. The principle benefit of manually evaluating RPS intervals is that we can be more responsive for better performance and powersaving for both spiky workloads and steady-state. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1698 Fixes: 98479ada421a ("drm/i915/gt: Treat idling as a RPS downclock event") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Andi Shyti <andi.shyti@intel.com> Reviewed-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-4-chris@chris-wilson.co.uk
|
#
426d0073 |
|
29-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Always enable busy-stats for execlists In the near future, we will utilize the busy-stats on each engine to approximate the C0 cycles of each, and use that as an input to a manual RPS mechanism. That entails having busy-stats always enabled and so we can remove the enable/disable routines and simplify the pmu setup. As a consequence of always having the stats enabled, we can also show the current active time via sysfs/engine/xcs/active_time_ns. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
|
#
be1cb55a |
|
29-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Keep a no-frills swappable copy of the default context state We need to keep the default context state around to instantiate new contexts (aka golden rendercontext), and we also keep it pinned while the engine is active so that we can quickly reset a hanging context. However, the default contexts are large enough to merit keeping in swappable memory as opposed to kernel memory, so we store them inside shmemfs. Currently, we use the normal GEM objects to create the default context image, but we can throw away all but the shmemfs file. This greatly simplifies the tricky power management code which wants to run underneath the normal GT locking, and we definitely do not want to use any high level objects that may appear to recurse back into the GT. Though perhaps the primary advantage of the complex GEM object is that we aggressively cache the mapping, but here we are recreating the vm_area everytime time we unpark. At the worst, we add a lightweight cache, but first find a microbenchmark that is impacted. Having started to create some utility functions to make working with shmemfs objects easier, we can start putting them to wider use, where GEM objects are overkill, such as storing persistent error state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Matthew Auld <matthew.auld@intel.com> Cc: Ramalingam C <ramalingam.c@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200429172429.6054-1-chris@chris-wilson.co.uk
|
#
5c4a53e3 |
|
28-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Track inflight CCID The presumption is that by using a circular counter that is twice as large as the maximum ELSP submission, we would never reuse the same CCID for two inflight contexts. However, if we continually preempt an active context such that it always remains inflight, it can be resubmitted with an arbitrary number of paired contexts. As each of its paired contexts will use a new CCID, eventually it will wrap and submit two ELSP with the same CCID. Rather than use a simple circular counter, switch over to a small bitmap of inflight ids so we can avoid reusing one that is still potentially active. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-2-chris@chris-wilson.co.uk
|
#
2632f174 |
|
28-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Avoid reusing the same logical CCID The bspec is confusing on the nature of the upper 32bits of the LRC descriptor. Once upon a time, it said that it uses the upper 32b to decide if it should perform a lite-restore, and so we must ensure that each unique context submitted to HW is given a unique CCID [for the duration of it being on the HW]. Currently, this is achieved by using a small circular tag, and assigning every context submitted to HW a new id. However, this tag is being cleared on repinning an inflight context such that we end up re-using the 0 tag for multiple contexts. To avoid accidentally clearing the CCID in the upper 32bits of the LRC descriptor, split the descriptor into two dwords so we can update the GGTT address separately from the CCID. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/1796 Fixes: 2935ed5339c4 ("drm/i915: Remove logical HW ID") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: <stable@vger.kernel.org> # v5.5+ Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200428184751.11257-1-chris@chris-wilson.co.uk
|
#
23122a4d |
|
15-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Scrub execlists state on resume Before we resume, we reset the HW so we restart from a known good state. However, as a part of the reset process, we drain our pending CS event queue -- and if we are resuming that does not correspond to internal state. On setup, we are scrubbing the CS pointers, but alas only on setup. Apply the sanitization not just to setup, but to all resumes. Reported-by: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Venkata Ramana Nayana <venkata.ramana.nayana@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200416114117.3460-1-chris@chris-wilson.co.uk
|
#
c4e8ba73 |
|
07-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Yield the timeslice if caught waiting on a user semaphore If we find ourselves waiting on a MI_SEMAPHORE_WAIT, either within the user batch or in our own preamble, the engine raises a GT_WAIT_ON_SEMAPHORE interrupt. We can unmask that interrupt and so respond to a semaphore wait by yielding the timeslice, if we have another context to yield to! The only real complication is that the interrupt is only generated for the start of the semaphore wait, and is asynchronous to our process_csb() -- that is, we may not have registered the timeslice before we see the interrupt. To ensure we don't miss a potential semaphore blocking forward progress (e.g. selftests/live_timeslice_preempt) we mark the interrupt and apply it to the next timeslice regardless of whether it was active at the time. v2: We use semaphores in preempt-to-busy, within the timeslicing implementation itself! Ergo, when we do insert a preemption due to an expired timeslice, the new context may start with the missed semaphore flagged by the retired context and be yielded, ad infinitum. To avoid this, read the context id at the time of the semaphore interrupt and only yield if that context is still active. Fixes: 8ee36e048c98 ("drm/i915/execlists: Minimalistic timeslicing") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200407130811.17321-1-chris@chris-wilson.co.uk
|
#
43acd651 |
|
02-Apr-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Keep a per-engine request pool Add a tiny per-engine request mempool so that we should always have a request available for powermanagement allocations from tricky contexts. This reserve is expected to be only used for kernel contexts when barriers must be emitted [almost] without fail. The main consumer for this reserved request is expected to be engine-pm, for which we know that there will always be at least the previous pm request that we can reuse under mempressure (so there should always be a spare request for engine_park()). This is an alternative to using a comparatively bulky mempool, which requires custom handling for both our reserved allocation requirement and to protect our TYPESAFE_BY_RCU slab cache. The advantage of mempool would be that it would allow us to keep a larger per-engine request pool. However, converting over to mempool is straightforward should the need arise. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-and-tested-by: Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200402184037.21630-1-chris@chris-wilson.co.uk
|
#
062444bb |
|
28-Feb-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Expose busywait duration to sysfs We busywait on an inflight request (one that is currently executing on HW, and so might complete quickly) prior to setting up an interrupt and sleeping. The trade off is that we keep an expensive CPU core busy in order to avoid wake up latency: where that trade off should lie is best left to the sysadmin. The busywait mechanism can be compiled out with ./scripts/config --set-val DRM_I915_SPIN_REQUEST 0 The maximum busywait duration can be adjusted per-engine using, /sys/class/drm/card?/engine/*/ms_busywait_duration_ns Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Steve Carbonari <steven.carbonari@intel.com> Tested-by: Steve Carbonari <steven.carbonari@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200228131716.3243616-4-chris@chris-wilson.co.uk
|
#
c3f1ed90 |
|
16-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Allow temporary suspension of inflight requests In order to support out-of-line error capture, we need to remove the active request from HW and put it to one side while a worker compresses and stores all the details associated with that request. (As that compression may take an arbitrary user-controlled amount of time, we want to let the engine continue running on other workloads while the hanging request is dumped.) Not only do we need to remove the active request, but we also have to remove its context and all requests that were dependent on it (both in flight, queued and future submission). Finally once the capture is complete, we need to be able to resubmit the request and its dependents and allow them to execute. v2: Replace stack recursion with a simple list. v3: Check all the parents, not just the first, when searching for a stuck ancestor! References: https://gitlab.freedesktop.org/drm/intel/issues/738 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk (cherry picked from commit 32ff621fd74496f0c33644125fb69ff175859b1f) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
b6560007 |
|
09-Feb-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/selftests: Drop live_preempt_hang live_preempt_hang's use of hang injection has been superseded by live_preempt_reset's use of an non-preemptible spinner. The latter does not require intrusive hacks into the code. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200209230838.361154-2-chris@chris-wilson.co.uk
|
#
f7043102 |
|
29-Jan-2020 |
Lionel Landwerlin <lionel.g.landwerlin@intel.com> |
drm/i915: add extra slice common debug registers Could be helpful for debugging purposes. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200129181638.1528150-1-lionel.g.landwerlin@intel.com
|
#
70a76a9b |
|
28-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Hook up CS_MASTER_ERROR_INTERRUPT Now that we have offline error capture and can reset an engine from inside an atomic context while also preserving the GPU state for post-mortem analysis, it is time to handle error interrupts thrown by the command parser. This provides a much, much faster mechanism for us to detect known problems than using heartbeats/hangchecks, and also provides a mechanism for when those are disabled. However, it is limited to problems the HW can detect in the CS and so not a complete solution for detecting lockups. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200128204318.4182039-2-chris@chris-wilson.co.uk
|
#
5eec7182 |
|
16-Jan-2020 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Align engine->uabi_class/instance with i915_drm.h In our ABI we have defined I915_ENGINE_CLASS_INVALID_NONE and I915_ENGINE_CLASS_INVALID_VIRTUAL as negative values which creates implicit coupling with type widths used in, also ABI, struct i915_engine_class_instance. One place where we export engine->uabi_class I915_ENGINE_CLASS_INVALID_VIRTUAL is from our our tracepoints. Because the type of the former is u8 in contrast to u16 defined in the ABI, 254 will be returned instead of 65534 which userspace would legitimately expect. Another place is I915_CONTEXT_PARAM_ENGINES. Therefore we need to align the type used to store engine ABI class and instance. v2: * Update the commit message mentioning get_engines and cc stable. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: <stable@vger.kernel.org> # v5.3+ Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200116134508.25211-1-tvrtko.ursulin@linux.intel.com (cherry picked from commit 0b3bd0cdc329a1e2e00995cffd61aacf58c87cb4) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
0b3bd0cd |
|
16-Jan-2020 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Align engine->uabi_class/instance with i915_drm.h In our ABI we have defined I915_ENGINE_CLASS_INVALID_NONE and I915_ENGINE_CLASS_INVALID_VIRTUAL as negative values which creates implicit coupling with type widths used in, also ABI, struct i915_engine_class_instance. One place where we export engine->uabi_class I915_ENGINE_CLASS_INVALID_VIRTUAL is from our our tracepoints. Because the type of the former is u8 in contrast to u16 defined in the ABI, 254 will be returned instead of 65534 which userspace would legitimately expect. Another place is I915_CONTEXT_PARAM_ENGINES. Therefore we need to align the type used to store engine ABI class and instance. v2: * Update the commit message mentioning get_engines and cc stable. (Chris) Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: <stable@vger.kernel.org> # v5.3+ Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20200116134508.25211-1-tvrtko.ursulin@linux.intel.com
|
#
32ff621f |
|
16-Jan-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Allow temporary suspension of inflight requests In order to support out-of-line error capture, we need to remove the active request from HW and put it to one side while a worker compresses and stores all the details associated with that request. (As that compression may take an arbitrary user-controlled amount of time, we want to let the engine continue running on other workloads while the hanging request is dumped.) Not only do we need to remove the active request, but we also have to remove its context and all requests that were dependent on it (both in flight, queued and future submission). Finally once the capture is complete, we need to be able to resubmit the request and its dependents and allow them to execute. v2: Replace stack recursion with a simple list. v3: Check all the parents, not just the first, when searching for a stuck ancestor! References: https://gitlab.freedesktop.org/drm/intel/issues/738 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200116184754.2860848-2-chris@chris-wilson.co.uk
|
#
e26b6d43 |
|
21-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Pull GT initialisation under intel_gt_init() Begin pulling the GT setup underneath a single GT umbrella; let intel_gt take ownership of its engines! As hinted, the complication is the lifetime of the probed engine versus the active lifetime of the GT backends. We need to detect the engine layout early and keep it until the end so that we can sanitize state on takeover and release. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Andi Shyti <andi.shyti@intel.com> Acked-by: Andi Shyti <andi.shyti@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191222120752.1368352-1-chris@chris-wilson.co.uk
|
#
b81e4d9b |
|
18-Dec-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Track engine round-trip times Knowing the round trip time of an engine is useful for tracking the health of the system as well as providing a metric for the baseline responsiveness of the engine. We can use the latter metric for automatically tuning our waits in selftests and when idling so we don't confuse a slower system with a dead one. Upon idling the engine, we send one last pulse to switch the context away from precious user state to the volatile kernel context. We know the engine is idle at this point, and the pulse is non-preemptible, so this provides us with a good measurement of the round trip time. It also provides us with faster engine parking for ringbuffer submission, which is a welcome bonus (e.g. softer-rc6). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191219105043.4169050-1-chris@chris-wilson.co.uk Link: https://patchwork.freedesktop.org/patch/msgid/20191219124353.8607-2-chris@chris-wilson.co.uk
|
#
31177017 |
|
25-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Schedule request retirement when timeline idles The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA") is that it disables RC6 while Skylake (and friends) is active, and we do not consider the GPU idle until all outstanding requests have been retired and the engine switched over to the kernel context. If userspace is idle, this task falls onto our background idle worker, which only runs roughly once a second, meaning that userspace has to have been idle for a couple of seconds before we enable RC6 again. Naturally, this causes us to consume considerably more energy than before as powersaving is effectively disabled while a display server (here's looking at you Xorg) is running. As execlists will get a completion event as each context is completed, we can use this interrupt to queue a retire worker bound to this engine to cleanup idle timelines. We will then immediately notice the idle engine (without userspace intervention or the aid of the background retire worker) and start parking the GPU. Thus during light workloads, we will do much more work to idle the GPU faster... Hopefully with commensurate power saving! v2: Watch context completions and only look at those local to the engine when retiring to reduce the amount of excess work we perform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315 References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA") References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk (cherry picked from commit 4f88f8747fa43c97c3b3712d8d87295ea757cc51) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
#
4f88f874 |
|
25-Nov-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Schedule request retirement when timeline idles The major drawback of commit 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA") is that it disables RC6 while Skylake (and friends) is active, and we do not consider the GPU idle until all outstanding requests have been retired and the engine switched over to the kernel context. If userspace is idle, this task falls onto our background idle worker, which only runs roughly once a second, meaning that userspace has to have been idle for a couple of seconds before we enable RC6 again. Naturally, this causes us to consume considerably more energy than before as powersaving is effectively disabled while a display server (here's looking at you Xorg) is running. As execlists will get a completion event as each context is completed, we can use this interrupt to queue a retire worker bound to this engine to cleanup idle timelines. We will then immediately notice the idle engine (without userspace intervention or the aid of the background retire worker) and start parking the GPU. Thus during light workloads, we will do much more work to idle the GPU faster... Hopefully with commensurate power saving! v2: Watch context completions and only look at those local to the engine when retiring to reduce the amount of excess work we perform. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112315 References: 7e34f4e4aad3 ("drm/i915/gen8+: Add RC6 CTX corruption WA") References: 2248a28384fe ("drm/i915/gen8+: Add RC6 CTX corruption WA") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191125105858.1718307-3-chris@chris-wilson.co.uk
|
#
311a50e7 |
|
01-Aug-2018 |
Jon Bloomfield <jon.bloomfield@intel.com> |
drm/i915: Add support for mandatory cmdparsing The existing cmdparser for gen7 can be bypassed by specifying batch_len=0 in the execbuf call. This is safe because bypassing simply reduces the cmd-set available. In a later patch we will introduce cmdparsing for gen9, as a security measure, which must be strictly enforced since without it we are vulnerable to DoS attacks. Introduce the concept of 'required' cmd parsing that cannot be bypassed by submitting zero-length bb's. v2: rebase (Mika) v2: rebase (Mika) v3: fix conflict on engine flags (Mika) Signed-off-by: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Tony Luck <tony.luck@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Takashi Iwai <tiwai@suse.de> Cc: Tyler Hicks <tyhicks@canonical.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris.p.wilson@intel.com>
|
#
b79029b2 |
|
29-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Make timeslice duration configurable Execlists uses a scheduling quantum (a timeslice) to alternate execution between ready-to-run contexts of equal priority. This ensures that all users (though only if they of equal importance) have the opportunity to run and prevents livelocks where contexts may have implicit ordering due to userspace semaphores. However, not all workloads necessarily benefit from timeslicing and in the extreme some sysadmin may want to disable or reduce the timeslicing granularity. The timeslicing mechanism can be compiled out^W^W disabled (but should DCE!) with ./scripts/config --set-val DRM_I915_TIMESLICE_DURATION 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191029091632.26281-1-chris@chris-wilson.co.uk
|
#
2871ea85 |
|
24-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Split intel_ring_submission Split the legacy submission backend from the common CS ring buffer handling. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191024100344.5041-1-chris@chris-wilson.co.uk
|
#
058179e7 |
|
23-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Replace hangcheck by heartbeats Replace sampling the engine state every so often with a periodic heartbeat request to measure the health of an engine. This is coupled with the forced-preemption to allow long running requests to survive so long as they do not block other users. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-5-chris@chris-wilson.co.uk
|
#
3a7a92ab |
|
23-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Force preemption If the preempted context takes too long to relinquish control, e.g. it is stuck inside a shader with arbitration disabled, evict that context with an engine reset. This ensures that preemptions are reasonably responsive, providing a tighter QoS for the more important context at the cost of flagging unresponsive contexts more frequently (i.e. instead of using an ~10s hangcheck, we now evict at ~100ms). The challenge of lies in picking a timeout that can be reasonably serviced by HW for typical workloads, balancing the existing clients against the needs for responsiveness. Note that coupled with timeslicing, this will lead to rapid GPU "hang" detection with multiple active contexts vying for GPU time. The forced preemption mechanism can be compiled out with ./scripts/config --set-val DRM_I915_PREEMPT_TIMEOUT 0 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-2-chris@chris-wilson.co.uk
|
#
a8c51ed2 |
|
23-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Try to more gracefully quiesce the system before resets If we are doing a normal GPU reset triggered after detecting a long period of stalled work, we can take our time and allow the engines to quiesce. Since we've stopped submission to the engine, and if we wait long enough an innocent context should complete, leaving the engine idle. So by waiting a short amount of time, we should prevent clobbering other users when resetting a stuck context. Suggested-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Suggested-by: Jon Bloomfield <jon.bloomfield@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191023133108.21401-1-chris@chris-wilson.co.uk
|
#
a50134b1 |
|
17-Oct-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Make for_each_engine_masked work on intel_gt Medium term goal is to eliminate the i915->engine[] array and to get there we have recently introduced equivalent array in intel_gt. Now we need to migrate the code further towards this state. This next step is to eliminate usage of i915->engines[] from the for_each_engine_masked iterator. For this to work we also need to use engine->id as index when populating the gt->engine[] array and adjust the default engine set indexing to use engine->legacy_idx instead of assuming gt->engines[] indexing. v2: * Populate gt->engine[] earlier. * Check that we don't duplicate engine->legacy_idx v3: * Work around the initialization order issue between default_engines() and intel_engines_driver_register() which sets engine->legacy_idx for now. It will be fixed properly later. v4: * Merge with forgotten v2.5. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20191017161852.8836-1-tvrtko.ursulin@linux.intel.com
|
#
2935ed53 |
|
04-Oct-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Remove logical HW ID With the introduction of ctx->engines[] we allow multiple logical contexts to be used on the same engine (e.g. with virtual engines). According to bspec, aach logical context requires a unique tag in order for context-switching to occur correctly between them. [Simple experiments show that it is not so easy to trick the HW into performing a lite-restore with matching logical IDs, though my memory from early Broadwell experiments do suggest that it should be generating lite-restores.] We only need to keep a unique tag for the active lifetime of the context, and for as long as we need to identify that context. The HW uses the tag to determine if it should use a lite-restore (why not the LRCA?) and passes the tag back for various status identifies. The only status we need to track is for OA, so when using perf, we assign the specific context a unique tag. v2: Calculate required number of tags to fill ELSP. Fixes: 976b55f0e1db ("drm/i915: Allow a context to define its set of engines") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111895 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191004134015.13204-14-chris@chris-wilson.co.uk
|
#
cdb736fa |
|
06-Sep-2019 |
Mika Kuoppala <mika.kuoppala@linux.intel.com> |
drm/i915: Use engine relative LRIs on context setup Daniele pointed out that relative mmio works differently in on context restore. Instead of adding the engine mmio base to offset, it masks out the base and adds bits [12:2] to current engine base. This should allow us to construct context register state to be applicable to all instances, including virtual. And avoid the trouble of updating the registers on virtual instances when submitting work. v2: only enable for gen12 for now (Mika) v3: make enabling readable (Chris) Bspec: 20206 Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Lucas De Marchi <lucas.demarchi@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Suggested-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190906134957.25909-1-mika.kuoppala@linux.intel.com
|
#
eaef5b3c |
|
23-Aug-2019 |
Stuart Summers <stuart.summers@intel.com> |
drm/i915: Refactor instdone loops on new subslice functions Refactor instdone loops to use the new intel_sseu_has_subslice function. Signed-off-by: Stuart Summers <stuart.summers@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190823160307.180813-10-stuart.summers@intel.com
|
#
3d7b3039 |
|
15-Aug-2019 |
Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> |
drm/i915: Move engine IDs out of i915_reg.h To remove the dependency between the GT headers and i915_reg.h, move the definition of the engine IDs/classes to intel_engine_types.h Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190816012343.36433-3-daniele.ceraolospurio@intel.com
|
#
df403069 |
|
16-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Lift process_csb() out of the irq-off spinlock If we only call process_csb() from the tasklet, though we lose the ability to bypass ksoftirqd interrupt processing on direct submission paths, we can push it out of the irq-off spinlock. The penalty is that we then allow schedule_out to be called concurrently with schedule_in requiring us to handle the usage count (baked into the pointer itself) atomically. As we do kick the tasklets (via local_bh_enable()) after our submission, there is a possibility there to see if we can pull the local softirq processing back from the ksoftirqd. v2: Store the 'switch_priority_hint' on submission, so that we can safely check during process_csb(). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190816171608.11760-1-chris@chris-wilson.co.uk
|
#
e5dadff4 |
|
15-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Protect request retirement with timeline->mutex Forgo the struct_mutex requirement for request retirement as we have been transitioning over to only using the timeline->mutex for controlling the lifetime of a request on that timeline. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190815205709.24285-4-chris@chris-wilson.co.uk
|
#
75d0a7f3 |
|
09-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Lift timeline into intel_context Move the timeline from being inside the intel_ring to intel_context itself. This saves much pointer dancing and makes the relations of the context to its timeline much clearer. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809182518.20486-4-chris@chris-wilson.co.uk
|
#
f1c4d157 |
|
07-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Fix up the inverse mapping for default ctx->engines[] The order in which we store the engines inside default_engines() for the legacy ctx->engines[] has to match the legacy I915_EXEC_RING selector mapping in execbuf::user_map. If we present VCS2 as being the second instance of the video engine, legacy userspace calls that I915_EXEC_BSD2 and so we need to insert it into the second video slot. v2: Record the legacy mapping (hopefully we can remove this need in the future) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111328 Fixes: 2edda80db3d0 ("drm/i915: Rename engines to match their user interface") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> #v1 Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190808110612.23539-2-chris@chris-wilson.co.uk
|
#
750e76b4 |
|
06-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Move the [class][inst] lookup for engines onto the GT To maintain a fast lookup from a GT centric irq handler, we want the engine lookup tables on the intel_gt. To avoid having multiple copies of the same multi-dimension lookup table, move the generic user engine lookup into an rbtree (for fast and flexible indexing). v2: Split uabi_instance cf uabi_class v3: Set uabi_class/uabi_instance after collating all engines to provide a stable uabi across parallel unordered construction. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> #v2 Link: https://patchwork.freedesktop.org/patch/msgid/20190806124300.24945-2-chris@chris-wilson.co.uk
|
#
b40d7378 |
|
04-Aug-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Replace struct_mutex for batch pool serialisation Switch to tracking activity via i915_active on individual nodes, only keeping a list of retired objects in the cache, and reaping the cache when the engine itself idles. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190804124826.30272-2-chris@chris-wilson.co.uk
|
#
a5627721 |
|
28-Jul-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Inline engine->init_context into its caller We only use the init_context vfunc once while recording the default context state, and we use the same sequence in each backend (eliding steps that do not apply). Remove the vfunc for simplicity and de-duplication. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190729113720.24830-1-chris@chris-wilson.co.uk
|
#
ac65bdfe |
|
19-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Keep rings pinned while the context is active Remember to keep the rings pinned as well as the context image until the GPU is no longer active. v2: Introduce a ring->pin_count primarily to hide the mock_ring that doesn't fit into the normal GGTT vma picture. v3: Order is important in teardown, ringbuffer submission needs to drop the pin count on the engine->kernel_context before it can gleefully free its ring. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110946 Fixes: ce476c80b8bf ("drm/i915: Keep contexts pinned until after the next kernel context switch") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190619170135.15281-1-chris@chris-wilson.co.uk (cherry picked from commit 09c5ab384f6fb30f834a5777888b4486dd7f015d) Signed-off-by: Jani Nikula <jani.nikula@intel.com>
|
#
71b0846c |
|
09-Jul-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/guc: Remove preemption support for current fw Preemption via GuC submission is not being supported with its current legacy incarnation. The current FW does support a similar pre-emption flow via H2G, but it is class-based instead of being instance-based, which doesn't fit well with the i915 tracking. To fix this, the firmware is being updated to better support our needs with a new flow, so we can safely remove the old code. v2 (Daniele): resurrect & rebase, reword commit message, remove preempt_context as well Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Acked-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Michał Winiarski <michal.winiarski@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190710005437.3496-2-daniele.ceraolospurio@intel.com
|
#
f0c02c1b |
|
21-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Rename i915_timeline to intel_timeline and move under gt Move all timeline code under gt and rename to intel_gt prefix. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Suggested-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-32-tvrtko.ursulin@linux.intel.com
|
#
f937f561 |
|
21-Jun-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Store backpointer to intel_gt in the engine It will come useful in the next patch. v2: * Do mock_engine as well. v3: * And the virtual engine... Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190621070811.7006-11-tvrtko.ursulin@linux.intel.com
|
#
8ee36e04 |
|
20-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Minimalistic timeslicing If we have multiple contexts of equal priority pending execution, activate a timer to demote the currently executing context in favour of the next in the queue when that timeslice expires. This enforces fairness between contexts (so long as they allow preemption -- forced preemption, in the future, will kick those who do not obey) and allows us to avoid userspace blocking forward progress with e.g. unbounded MI_SEMAPHORE_WAIT. For the starting point here, we use the jiffie as our timeslice so that we should be reasonably efficient wrt frequent CPU wakeups. Testcase: igt/gem_exec_scheduler/semaphore-resolve Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-2-chris@chris-wilson.co.uk
|
#
22b7a426 |
|
20-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Preempt-to-busy When using a global seqno, we required a precise stop-the-workd event to handle preemption and unwind the global seqno counter. To accomplish this, we would preempt to a special out-of-band context and wait for the machine to report that it was idle. Given an idle machine, we could very precisely see which requests had completed and which we needed to feed back into the run queue. However, now that we have scrapped the global seqno, we no longer need to precisely unwind the global counter and only track requests by their per-context seqno. This allows us to loosely unwind inflight requests while scheduling a preemption, with the enormous caveat that the requests we put back on the run queue are still _inflight_ (until the preemption request is complete). This makes request tracking much more messy, as at any point then we can see a completed request that we believe is not currently scheduled for execution. We also have to be careful not to rewind RING_TAIL past RING_HEAD on preempting to the running context, and for this we use a semaphore to prevent completion of the request before continuing. To accomplish this feat, we change how we track requests scheduled to the HW. Instead of appending our requests onto a single list as we submit, we track each submission to ELSP as its own block. Then upon receiving the CS preemption event, we promote the pending block to the inflight block (discarding what was previously being tracked). As normal CS completion events arrive, we then remove stale entries from the inflight tracker. v2: Be a tinge paranoid and ensure we flush the write into the HWS page for the GPU semaphore to pick in a timely fashion. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190620142052.19311-1-chris@chris-wilson.co.uk
|
#
09c5ab38 |
|
19-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Keep rings pinned while the context is active Remember to keep the rings pinned as well as the context image until the GPU is no longer active. v2: Introduce a ring->pin_count primarily to hide the mock_ring that doesn't fit into the normal GGTT vma picture. v3: Order is important in teardown, ringbuffer submission needs to drop the pin count on the engine->kernel_context before it can gleefully free its ring. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110946 Fixes: ce476c80b8bf ("drm/i915: Keep contexts pinned until after the next kernel context switch") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190619170135.15281-1-chris@chris-wilson.co.uk
|
#
44d89409 |
|
18-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Make the semaphore saturation mask global The idea behind keeping the saturation mask local to a context backfired spectacularly. The premise with the local mask was that we would be more proactive in attempting to use semaphores after each time the context idled, and that all new contexts would attempt to use semaphores ignoring the current state of the system. This turns out to be horribly optimistic. If the system state is still oversaturated and the existing workloads have all stopped using semaphores, the new workloads would attempt to use semaphores and be deprioritised behind real work. The new contexts would not switch off using semaphores until their initial batch of low priority work had completed. Given sufficient backload load of equal user priority, this would completely starve the new work of any GPU time. To compensate, remove the local tracking in favour of keeping it as global state on the engine -- once the system is saturated and semaphores are disabled, everyone stops attempting to use semaphores until the system is idle again. One of the reason for preferring local context tracking was that it worked with virtual engines, so for switching to global state we could either do a complete check of all the virtual siblings or simply disable semaphores for those requests. This takes the simpler approach of disabling semaphores on virtual engines. The downside is that the decision that the engine is saturated is a local measure -- we are only checking whether or not this context was scheduled in a timely fashion, it may be legitimately delayed due to user priorities. We still have the same dilemma though, that we do not want to employ the semaphore poll unless it will be used. v2: Explain why we need to assume the worst wrt virtual engines. Fixes: ca6e56f654e7 ("drm/i915: Disable semaphore busywaits on saturated systems") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Dmitry Ermilov <dmitry.ermilov@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190618074153.16055-8-chris@chris-wilson.co.uk
|
#
422d7df4 |
|
14-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Replace engine->timeline with a plain list To continue the onslaught of removing the assumption of a global execution ordering, another casualty is the engine->timeline. Without an actual timeline to track, it is overkill and we can replace it with a much less grand plain list. We still need a list of requests inflight, for the simple purpose of finding inflight requests (for retiring, resetting, preemption etc). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-3-chris@chris-wilson.co.uk
|
#
ce476c80 |
|
14-Jun-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Keep contexts pinned until after the next kernel context switch We need to keep the context image pinned in memory until after the GPU has finished writing into it. Since it continues to write as we signal the final breadcrumb, we need to keep it pinned until the request after it is complete. Currently we know the order in which requests execute on each engine, and so to remove that presumption we need to identify a request/context-switch we know must occur after our completion. Any request queued after the signal must imply a context switch, for simplicity we use a fresh request from the kernel context. The sequence of operations for keeping the context pinned until saved is: - On context activation, we preallocate a node for each physical engine the context may operate on. This is to avoid allocations during unpinning, which may be from inside FS_RECLAIM context (aka the shrinker) - On context deactivation on retirement of the last active request (which is before we know the context has been saved), we add the preallocated node onto a barrier list on each engine - On engine idling, we emit a switch to kernel context. When this switch completes, we know that all previous contexts must have been saved, and so on retiring this request we can finally unpin all the contexts that were marked as deactivated prior to the switch. We can enhance this in future by flushing all the idle contexts on a regular heartbeat pulse of a switch to kernel context, which will also be used to check for hung engines. v2: intel_context_active_acquire/_release Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190614164606.15633-1-chris@chris-wilson.co.uk
|
#
a10f361d |
|
29-May-2019 |
Jani Nikula <jani.nikula@intel.com> |
Revert "drm/i915: Expand subslice mask" This reverts commit 1ac159e23c2c ("drm/i915: Expand subslice mask"), which kills ICL due to GEM_BUG_ON() sanity checks before CI even gets a chance to do anything. The commit exposes an issue in commit 1e40d4aea57b ("drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads"), which will also need to be addressed. There's a proposed fix [1], but considering the seeming uncertainty with the fix as well as the size of the regressing commit (in this context, the one that actually brings down ICL), this warrants a revert to get ICL working, and gives us time to get all of this right without rushing. Even if this means shooting the messenger. <3>[ 9.426327] intel_sseu_get_subslices:46 GEM_BUG_ON(slice >= sseu->max_slices) <4>[ 9.426355] ------------[ cut here ]------------ <2>[ 9.426357] kernel BUG at drivers/gpu/drm/i915/gt/intel_sseu.c:46! <4>[ 9.426371] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI <4>[ 9.426377] CPU: 1 PID: 364 Comm: systemd-udevd Not tainted 5.2.0-rc2-CI-CI_DRM_6159+ #1 <4>[ 9.426385] Hardware name: Intel Corporation Ice Lake Client Platform/IceLake U DDR4 SODIMM PD RVP TLC, BIOS ICLSFWR1.R00.3183.A00.1905020411 05/02/2019 <4>[ 9.426444] RIP: 0010:intel_sseu_get_subslices+0x8a/0xe0 [i915] <4>[ 9.426452] Code: d5 76 b7 e0 48 8b 35 9d 24 21 00 49 c7 c0 07 f0 72 a0 b9 2e 00 00 00 48 c7 c2 00 8e 6d a0 48 c7 c7 a5 14 5b a0 e8 36 3c be e0 <0f> 0b 48 c7 c1 80 d5 6f a0 ba 30 00 00 00 48 c7 c6 00 8e 6d a0 48 <4>[ 9.426468] RSP: 0018:ffffc9000037b9c8 EFLAGS: 00010282 <4>[ 9.426475] RAX: 000000000000000f RBX: 0000000000000000 RCX: 0000000000000000 <4>[ 9.426482] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88849e346f98 <4>[ 9.426490] RBP: ffff88848a200000 R08: 0000000000000004 R09: ffff88849d50b000 <4>[ 9.426497] R10: 0000000000000000 R11: ffff88849e346f98 R12: ffff88848a209e78 <4>[ 9.426505] R13: 0000000003000000 R14: ffff88848a20b1a8 R15: 0000000000000000 <4>[ 9.426513] FS: 00007f73d5ae8680(0000) GS:ffff88849fc80000(0000) knlGS:0000000000000000 <4>[ 9.426521] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 9.426527] CR2: 0000561417b01260 CR3: 0000000494764003 CR4: 0000000000760ee0 <4>[ 9.426535] PKRU: 55555554 <4>[ 9.426538] Call Trace: <4>[ 9.426585] wa_init_mcr+0xd5/0x110 [i915] <4>[ 9.426597] ? lock_acquire+0xa6/0x1c0 <4>[ 9.426645] icl_gt_workarounds_init+0x21/0x1a0 [i915] <4>[ 9.426694] ? i915_driver_load+0xfcf/0x18a0 [i915] <4>[ 9.426739] gt_init_workarounds+0x14c/0x230 [i915] <4>[ 9.426748] ? _raw_spin_unlock_irq+0x24/0x50 <4>[ 9.426789] intel_gt_init_workarounds+0x1b/0x30 [i915] <4>[ 9.426835] i915_driver_load+0xfd7/0x18a0 [i915] <4>[ 9.426843] ? lock_acquire+0xa6/0x1c0 <4>[ 9.426850] ? __pm_runtime_resume+0x4f/0x80 <4>[ 9.426857] ? _raw_spin_unlock_irqrestore+0x4c/0x60 <4>[ 9.426863] ? _raw_spin_unlock_irqrestore+0x4c/0x60 <4>[ 9.426870] ? lockdep_hardirqs_on+0xe3/0x1b0 <4>[ 9.426915] i915_pci_probe+0x29/0xa0 [i915] <4>[ 9.426923] pci_device_probe+0x9e/0x120 <4>[ 9.426930] really_probe+0xea/0x3c0 <4>[ 9.426936] driver_probe_device+0x10b/0x120 <4>[ 9.426942] device_driver_attach+0x4a/0x50 <4>[ 9.426948] __driver_attach+0x97/0x130 <4>[ 9.426954] ? device_driver_attach+0x50/0x50 <4>[ 9.426960] bus_for_each_dev+0x74/0xc0 <4>[ 9.426966] bus_add_driver+0x13f/0x210 <4>[ 9.426971] ? 0xffffffffa083b000 <4>[ 9.426976] driver_register+0x56/0xe0 <4>[ 9.426982] ? 0xffffffffa083b000 <4>[ 9.426987] do_one_initcall+0x58/0x300 <4>[ 9.426994] ? do_init_module+0x1d/0x1f6 <4>[ 9.427001] ? rcu_read_lock_sched_held+0x6f/0x80 <4>[ 9.427007] ? kmem_cache_alloc_trace+0x261/0x290 <4>[ 9.427014] do_init_module+0x56/0x1f6 <4>[ 9.427020] load_module+0x24d1/0x2990 <4>[ 9.427032] ? __se_sys_finit_module+0xd3/0xf0 <4>[ 9.427037] __se_sys_finit_module+0xd3/0xf0 <4>[ 9.427047] do_syscall_64+0x55/0x1c0 <4>[ 9.427053] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 9.427059] RIP: 0033:0x7f73d5609839 <4>[ 9.427064] Code: 00 f3 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 1f f6 2c 00 f7 d8 64 89 01 48 <4>[ 9.427082] RSP: 002b:00007ffdf34477b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 <4>[ 9.427091] RAX: ffffffffffffffda RBX: 00005559fd5d7b40 RCX: 00007f73d5609839 <4>[ 9.427099] RDX: 0000000000000000 RSI: 00007f73d52e8145 RDI: 000000000000000f <4>[ 9.427106] RBP: 00007f73d52e8145 R08: 0000000000000000 R09: 00007ffdf34478d0 <4>[ 9.427114] R10: 000000000000000f R11: 0000000000000246 R12: 0000000000000000 <4>[ 9.427121] R13: 00005559fd5c90f0 R14: 0000000000020000 R15: 00005559fd5d7b40 <4>[ 9.427131] Modules linked in: i915(+) mei_hdcp x86_pkg_temp_thermal coretemp snd_hda_intel crct10dif_pclmul crc32_pclmul snd_hda_codec snd_hwdep e1000e snd_hda_core ghash_clmulni_intel ptp snd_pcm cdc_ether usbnet mii pps_core mei_me mei prime_numbers btusb btrtl btbcm btintel bluetooth ecdh_generic ecc <4>[ 9.427254] ---[ end trace af3eeb543bd66e66 ]--- [1] http://patchwork.freedesktop.org/patch/msgid/20190528200655.11605-1-chris@chris-wilson.co.uk References: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_6159/fi-icl-u2/pstore0-1517155098_Oops_1.log References: 1e40d4aea57b ("drm/i915/cnl: Implement WaProgramMgsrForCorrectSliceSpecificMmioReads") Fixes: 1ac159e23c2c ("drm/i915: Expand subslice mask") Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Manasi Navare <manasi.d.navare@intel.com> Cc: Michel Thierry <michel.thierry@intel.com> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Oscar Mateo <oscar.mateo@intel.com> Cc: Stuart Summers <stuart.summers@intel.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Yunwei Zhang <yunwei.zhang@intel.com> Acked-by: Daniel Vetter <daniel@ffwll.ch> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190529082150.31526-1-jani.nikula@intel.com
|
#
1ac159e2 |
|
24-May-2019 |
Stuart Summers <stuart.summers@intel.com> |
drm/i915: Expand subslice mask Currently, the subslice_mask runtime parameter is stored as an array of subslices per slice. Expand the subslice mask array to better match what is presented to userspace through the I915_QUERY_TOPOLOGY_INFO ioctl. The index into this array is then calculated: slice * subslice stride + subslice index / 8 v2: fix spacing in set_sseu_info args use set_sseu_info to initialize sseu data when building device status in debugfs rename variables in intel_engine_types.h to avoid checkpatch warnings v3: update headers in intel_sseu.h v4: add const to some sseu_dev_info variables use sseu->eu_stride for EU stride calculations v5: address review comments from Tvrtko and Daniele v6: remove extra space in intel_sseu_get_subslices return the correct subslice enable in for_each_instdone add GEM_BUG_ON to ensure user doesn't pass invalid ss_mask size use printk formatted string for subslice mask v7: remove string.h header and rebase Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Signed-off-by: Stuart Summers <stuart.summers@intel.com> Signed-off-by: Manasi Navare <manasi.d.navare@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190524154022.13575-6-stuart.summers@intel.com
|
#
5e5d2e20 |
|
28-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Split GEM object type definition to its own header For convenience in avoiding inline spaghetti, keep the type definition as a separate header. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190528092956.14910-1-chris@chris-wilson.co.uk
|
#
c5d3e39c |
|
22-May-2019 |
Tvrtko Ursulin <tvrtko.ursulin@intel.com> |
drm/i915: Engine discovery query Engine discovery query allows userspace to enumerate engines, probe their configuration features, all without needing to maintain the internal PCI ID based database. A new query for the generic i915 query ioctl is added named DRM_I915_QUERY_ENGINE_INFO, together with accompanying structure drm_i915_query_engine_info. The address of latter should be passed to the kernel in the query.data_ptr field, and should be large enough for the kernel to fill out all known engines as struct drm_i915_engine_info elements trailing the query. As with other queries, setting the item query length to zero allows userspace to query minimum required buffer size. Enumerated engines have common type mask which can be used to query all hardware engines, versus engines userspace can submit to using the execbuf uAPI. Engines also have capabilities which are per engine class namespace of bits describing features not present on all engine instances. v2: * Fixed HEVC assignment. * Reorder some fields, rename type to flags, increase width. (Lionel) * No need to allocate temporary storage if we do it engine by engine. (Lionel) v3: * Describe engine flags and mark mbz fields. (Lionel) * HEVC only applies to VCS. v4: * Squash SFC flag into main patch. * Tidy some comments. v5: * Add uabi_ prefix to engine capabilities. (Chris Wilson) * Report exact size of engine info array. (Chris Wilson) * Drop the engine flags. (Joonas Lahtinen) * Added some more reserved fields. * Move flags after class/instance. v6: * Do not check engine info array was zeroed by userspace but zero the unused fields for them instead. v7: * Simplify length calculation loop. (Lionel Landwerlin) v8: * Remove MBZ comments where not applicable. * Rename ABI flags to match engine class define naming. * Rename SFC ABI flag to reflect it applies to VCS and VECS. * SFC is wired to even _logical_ engine instances. * SFC applies to VCS and VECS. * HEVC is present on all instances on Gen11. (Tony) * Simplify length calculation even more. (Chris Wilson) * Move info_ptr assigment closer to loop for clarity. (Chris Wilson) * Use vdbox_sfc_access from runtime info. * Rebase for RUNTIME_INFO. * Refactor for lower indentation. * Rename uAPI class/instance to engine_class/instance to avoid C++ keyword. v9: * Rebase for s/num_rings/num_engines/ in RUNTIME_INFO. v10: * Use new copy_query_item. v11: * Consolidate with struct i915_engine_class_instnace. Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Tony Ye <tony.ye@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> # v7 Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> Link: https://patchwork.freedesktop.org/patch/msgid/20190522090054.6007-1-tvrtko.ursulin@linux.intel.com
|
#
ee113690 |
|
21-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/execlists: Virtual engine bonding Some users require that when a master batch is executed on one particular engine, a companion batch is run simultaneously on a specific slave engine. For this purpose, we introduce virtual engine bonding, allowing maps of master:slaves to be constructed to constrain which physical engines a virtual engine may select given a fence on a master engine. For the moment, we continue to ignore the issue of preemption deferring the master request for later. Ideally, we would like to then also remove the slave and run something else rather than have it stall the pipeline. With load balancing, we should be able to move workload around it, but there is a similar stall on the master pipeline while it may wait for the slave to be executed. At the cost of more latency for the bonded request, it may be interesting to launch both on their engines in lockstep. (Bubbles abound.) Opens: Also what about bonding an engine as its own master? It doesn't break anything internally, so allow the silliness. v2: Emancipate the bonds v3: Couple in delayed scheduling for the selftests v4: Handle invalid mutually exclusive bonding v5: Mention what the uapi does v6: s/nbond/num_bonds/ Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-9-chris@chris-wilson.co.uk
|
#
6d06779e |
|
21-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Load balancing across a virtual engine Having allowed the user to define a set of engines that they will want to only use, we go one step further and allow them to bind those engines into a single virtual instance. Submitting a batch to the virtual engine will then forward it to any one of the set in a manner as best to distribute load. The virtual engine has a single timeline across all engines (it operates as a single queue), so it is not able to concurrently run batches across multiple engines by itself; that is left up to the user to submit multiple concurrent batches to multiple queues. Multiple users will be load balanced across the system. The mechanism used for load balancing in this patch is a late greedy balancer. When a request is ready for execution, it is added to each engine's queue, and when an engine is ready for its next request it claims it from the virtual engine. The first engine to do so, wins, i.e. the request is executed at the earliest opportunity (idle moment) in the system. As not all HW is created equal, the user is still able to skip the virtual engine and execute the batch on a specific engine, all within the same queue. It will then be executed in order on the correct engine, with execution on other virtual engines being moved away due to the load detection. A couple of areas for potential improvement left! - The virtual engine always take priority over equal-priority tasks. Mostly broken up by applying FQ_CODEL rules for prioritising new clients, and hopefully the virtual and real engines are not then congested (i.e. all work is via virtual engines, or all work is to the real engine). - We require the breadcrumb irq around every virtual engine request. For normal engines, we eliminate the need for the slow round trip via interrupt by using the submit fence and queueing in order. For virtual engines, we have to allow any job to transfer to a new ring, and cannot coalesce the submissions, so require the completion fence instead, forcing the persistent use of interrupts. - We only drip feed single requests through each virtual engine and onto the physical engines, even if there was enough work to fill all ELSP, leaving small stalls with an idle CS event at the end of every request. Could we be greedy and fill both slots? Being lazy is virtuous for load distribution on less-than-full workloads though. Other areas of improvement are more general, such as reducing lock contention, reducing dispatch overhead, looking at direct submission rather than bouncing around tasklets etc. sseu: Lift the restriction to allow sseu to be reconfigured on virtual engines composed of RENDER_CLASS (rcs). v2: macroize check_user_mbz() v3: Cancel virtual engines on wedging v4: Commence commenting v5: Replace 64b sibling_mask with a list of class:instance v6: Drop the one-element array in the uabi v7: Assert it is an virtual engine in to_virtual_engine() v8: Skip over holes in [class][inst] so we can selftest with (vcs0, vcs2) Link: https://github.com/intel/media-driver/pull/283 Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190521211134.16117-6-chris@chris-wilson.co.uk
|
#
519a0194 |
|
08-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/hangcheck: Replace hangcheck.seqno with RING_HEAD After realising we need to sample RING_START to detect context switches from preemption events that do not allow for the seqno to advance, we can also realise that the seqno itself is just a distance along the ring and so can be replaced by sampling RING_HEAD. v2: Bonus comment for the mystery separate CS_STALL before MI_USER_INTERRUPT Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190508080704.24223-1-chris@chris-wilson.co.uk
|
#
f4107766 |
|
30-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/hangcheck: Track context changes Given sufficient preemption, we may see a busy system that doesn't advance seqno while performing work across multiple contexts, and given sufficient pathology not even notice a change in ACTHD. What does change between the preempting contexts is their RING, so take note of that and treat a change in the ring address as being an indication of forward progress. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190501114541.10077-1-chris@chris-wilson.co.uk
|
#
45b9c968 |
|
01-May-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move the engine->destroy() vfunc onto the engine Make the engine responsible for cleaning itself up! This removes the i915->gt.cleanup vfunc that has been annoying the casual reader and myself for the last several years, and helps keep a future patch to add more cleanup tidy. v2: Assert that engine->destroy is set after the backend starts allocating its own state. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190501103204.18632-1-chris@chris-wilson.co.uk
|
#
79ffac85 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Invert the GEM wakeref hierarchy In the current scheme, on submitting a request we take a single global GEM wakeref, which trickles down to wake up all GT power domains. This is undesirable as we would like to be able to localise our power management to the available power domains and to remove the global GEM operations from the heart of the driver. (The intent there is to push global GEM decisions to the boundary as used by the GEM user interface.) Now during request construction, each request is responsible via its logical context to acquire a wakeref on each power domain it intends to utilize. Currently, each request takes a wakeref on the engine(s) and the engines themselves take a chipset wakeref. This gives us a transition on each engine which we can extend if we want to insert more powermangement control (such as soft rc6). The global GEM operations that currently require a struct_mutex are reduced to listening to pm events from the chipset GT wakeref. As we reduce the struct_mutex requirement, these listeners should evaporate. Perhaps the biggest immediate change is that this removes the struct_mutex requirement around GT power management, allowing us greater flexibility in request construction. Another important knock-on effect, is that by tracking engine usage, we can insert a switch back to the kernel context on that engine immediately, avoiding any extra delay or inserting global synchronisation barriers. This makes tracking when an engine and its associated contexts are idle much easier -- important for when we forgo our assumed execution ordering and need idle barriers to unpin used contexts. In the process, it means we remove a large chunk of code whose only purpose was to switch back to the kernel context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Cc: Imre Deak <imre.deak@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424200717.1686-5-chris@chris-wilson.co.uk
|
#
112ed2d3 |
|
24-Apr-2019 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Move GraphicsTechnology files under gt/ Start partitioning off the code that talks to the hardware (GT) from the uapi layers and move the device facing code under gt/ One casualty is s/intel_ringbuffer.h/intel_engine.h/ with the plan to subdivide that header and body further (and split out the submission code from the ringbuffer and logical context handling). This patch aims to be simple motion so git can fixup inflight patches with little mess. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Acked-by: Jani Nikula <jani.nikula@intel.com> Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190424174839.7141-1-chris@chris-wilson.co.uk
|