#
d3f23ab9 |
|
20-Jul-2023 |
Andrzej Hajda <andrzej.hajda@intel.com> |
drm/i915: use direct alias for i915 in requests i915_request contains direct alias to i915, there is no point to go via rq->engine->i915. v2: added missing rq.i915 initialization in measure_breadcrumb_dw. Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Acked-by: Nirmoy Das <nirmoy.das@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230720113002.1541572-1-andrzej.hajda@intel.com
|
#
ce2fce25 |
|
27-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915: Only include i915_reg.h from .c files Several of our i915 header files, have been including i915_reg.h. This means that any change to i915_reg.h will trigger a full rebuild of pretty much every file of the driver, even those that don't have any kind of register access. Let's delete the i915_reg.h include from all headers and add an explicit include from the .c files that truly need the register definitions; those that need a definition of i915_reg_t for a function definition can get it from i915_reg_defs.h instead. We also remove two non-register #define's (VLV_DISPLAY_BASE and GEN12_SFC_DONE_MAX) into i915_reg_defs.h to allow us to drop the i915_reg.h include from a couple of headers. There's probably a lot more header dependency optimization possible, but the changes here roughly cut the number of files compiled after 'touch i915_reg.h' in half --- a good first step. Cc: Jani Nikula <jani.nikula@intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-7-matthew.d.roper@intel.com
|
#
202b1f4c |
|
10-Jan-2022 |
Matt Roper <matthew.d.roper@intel.com> |
drm/i915/gt: Move engine registers to their own header Let's continue breaking up and cleaning up the massive i915_reg.h file by moving all registers that are defined in relation to an engine base to their own header. There are probably a bunch of other "engine registers" that we haven't moved yet (especially those that belong to the render engine in the 0x2??? range), but this is a relatively straightforward first step. Cc: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Matt Roper <matthew.d.roper@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
|
#
c816723b |
|
05-Jun-2021 |
Lucas De Marchi <lucas.demarchi@intel.com> |
drm/i915/gt: replace IS_GEN and friends with GRAPHICS_VER This was done by the following semantic patch: @@ expression i915; @@ - INTEL_GEN(i915) + GRAPHICS_VER(i915) @@ expression i915; expression E; @@ - INTEL_GEN(i915) >= E + GRAPHICS_VER(i915) >= E @@ expression dev_priv; expression E; @@ - !IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) != E @@ expression dev_priv; expression E; @@ - IS_GEN(dev_priv, E) + GRAPHICS_VER(dev_priv) == E @@ expression dev_priv; expression from, until; @@ - IS_GEN_RANGE(dev_priv, from, until) + IS_GRAPHICS_VER(dev_priv, from, until) @def@ expression E; identifier id =~ "^gen$"; @@ - id = GRAPHICS_VER(E) + ver = GRAPHICS_VER(E) @@ identifier def.id; @@ - id + ver It also takes care of renaming the variable we assign to GRAPHICS_VER() so to use "ver" rather than "gen". Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com> Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210605155356.4183026-2-lucas.demarchi@intel.com
|
#
12ca695d |
|
23-Mar-2021 |
Maarten Lankhorst <maarten.lankhorst@linux.intel.com> |
drm/i915: Do not share hwsp across contexts any more, v8. Instead of sharing pages with breadcrumbs, give each timeline a single page. This allows unrelated timelines not to share locks any more during command submission. As an additional benefit, seqno wraparound no longer requires i915_vma_pin, which means we no longer need to worry about a potential -EDEADLK at a point where we are ready to submit. Changes since v1: - Fix erroneous i915_vma_acquire that should be a i915_vma_release (ickle). - Extra check for completion in intel_read_hwsp(). Changes since v2: - Fix inconsistent indent in hwsp_alloc() (kbuild) - memset entire cacheline to 0. Changes since v3: - Do same in intel_timeline_reset_seqno(), and clflush for good measure. Changes since v4: - Use refcounting on timeline, instead of relying on i915_active. - Fix waiting on kernel requests. Changes since v5: - Bump amount of slots to maximum (256), for best wraparounds. - Add hwsp_offset to i915_request to fix potential wraparound hang. - Ensure timeline wrap test works with the changes. - Assign hwsp in intel_timeline_read_hwsp() within the rcu lock to fix a hang. Changes since v6: - Rename i915_request_active_offset to i915_request_active_seqno(), and elaborate the function. (tvrtko) Changes since v7: - Move hunk to where it belongs. (jekstrand) - Replace CACHELINE_BYTES with TIMELINE_SEQNO_BYTES. (jekstrand) Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Reviewed-by: Thomas Hellström <thomas.hellstrom@intel.com> #v1 Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20210323155059.628690-2-maarten.lankhorst@linux.intel.com
|
#
2267f684 |
|
12-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Flush gen3 relocs harder, again gen3 does not fully flush MI stores to memory on MI_FLUSH, such that a subsequent read from e.g. the sampler can bypass the store and read the stale value from memory. This is a serious issue when we are using MI stores to rewrite the batches for relocation, as it means that the batch is reading from random user/kernel memory. While it is particularly sensitive [and detectable] for relocations, reading stale data at any time is a worry. Having started with a small number of delaying stores and doubling until no more incoherency was seen over a few hours (with and without background memory pressure), 32 was the magic number. Note that it definitely doesn't fix the issue, merely adds a long delay between requests, sufficient to mostly hide the problem, enough to raise the mtbf to several hours. This is merely a stop gap. v2: Follow more closer with the gen5 w/a and include some post-invalidate flushes as well. Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2018 References: a889580c087a ("drm/i915: Flush GPU relocs harder for gen3") Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Reviewed-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200612123949.7093-1-chris@chris-wilson.co.uk
|
#
5a833995 |
|
02-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915: Drop i915_request.i915 backpointer We infrequently use the direct i915 backpointer from the i915_request, so do we really need to waste the space in the struct for it? 8 bytes from the most frequently allocated struct vs an 3 bytes and pointer chasing in using rq->engine->i915? Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Akeem G Abodunrin <akeem.g.abodunrin@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200602220953.21178-1-chris@chris-wilson.co.uk
|
#
c1f85878 |
|
01-Jun-2020 |
Chris Wilson <chris@chris-wilson.co.uk> |
drm/i915/gt: Split low level gen2-7 CS emitters Pull the routines for writing CS packets out of intel_ring_submission into their own files. These are low level operations for building CS instructions, rather than the logic for filling the global ring buffer with requests, and we will want to reuse them outside of this context. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Acked-by: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200601072446.19548-2-chris@chris-wilson.co.uk
|